(54) Apparatus and method for cataloguing symbolic data for use in performance analysis of
computer programs



(57) An apparatus and method for cataloging symbolic
data for use in performance analysis of computer
programs are provided. The apparatus and method store
symbolic data for loaded modules during or shortly after
a performance trace and utilize the stored symbolic
data when performing a performance analysis at a later
time. A merged symbol file is generated for a computer
program, or application, under trace. The merged
symbol file contains information useful in performing
symbolic resolution of address information in trace files
for each instance of a module. During post processing of
the trace information generated by a performance trace
of a computer program, symbolic information stored in
the merged symbol file is compared to the trace
information stored in the trace file. The correct symbolic
information in the merged symbol file for loaded modules
is identified based on a number of validating criteria. The
correct symbolic information for the loaded modules
may then be stored as an indexed database that is used
to resolve address information into corresponding
symbolic information when providing the trace information
to a display for use by a user.
Description

Technical Field of the Invention 

[0001] The present invention is directed to an appa- 
ratus and method for cataloging symbolic data for use 
in performance analysis of computer programs. 

Background of the Invention

[0002] In analyzing and enhancing performance of a 
data processing system and the applications executing 
within the data processing system, it is helpful to know
which software modules within a data processing sys- 
tem are using system resources. Effective management 
and enhancement of data processing systems requires 
knowing how and when various system resources are 
being used. Performance tools are used to monitor and 
examine a data processing system to determine re- 
source consumption as various software applications 
are executing within the data processing system. For ex- 
ample, a performance tool may identify the most fre- 
quently executed modules and instructions in a data 
processing system, or may identify those modules 
which allocate the largest amount of memory or perform 
the most I/O requests. Hardware performance tools may 
be built into the system or added at a later point in time. 
[0003] Software performance tools are also useful in 
data processing systems, such as personal computer 
systems, which typically do not contain many, if any, 
built-in hardware performance tools. One known soft- 
ware performance tool is a trace tool. A trace tool may 
use more than one technique to provide trace data that 
indicates execution flows for an executing program. One 
technique, so-called event-based profiling, keeps track of
particular sequences of instructions by logging certain
events as they occur. For example, a
trace tool may log every entry into, and every exit from, 
a module, subroutine, method, function, or system com- 
ponent. Alternately, a trace tool may log the requester 
and the amounts of memory allocated for each memory 
allocation request. 

[0004] Typically, a time-stamped record, where "time"
is defined as any monotonically increasing metric, such
as the number of instructions executed, is produced for
each such event. Corresponding pairs of records similar 
to entry-exit records also are used to trace execution of 
arbitrary code segments, starting and completing I/O or 
data transmission, and for many other events of interest. 
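
By way of illustration only, event-based tracing of the kind described above might be sketched as follows. The decorator `traced`, the list `TRACE`, and the counter used as the "time" metric are hypothetical names chosen for this sketch and are not part of the described trace tool.

```python
import functools
import itertools

# Hypothetical in-memory trace buffer; a real trace tool would typically
# write records to a dedicated buffer or file instead.
TRACE = []

# "Time" may be any monotonically increasing metric; here, a simple counter.
_clock = itertools.count()


def traced(func):
    """Log one entry record and one exit record for every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE.append((next(_clock), "entry", func.__name__))
        try:
            return func(*args, **kwargs)
        finally:
            # The exit record is written even if func raises.
            TRACE.append((next(_clock), "exit", func.__name__))
    return wrapper


@traced
def inner():
    return 1


@traced
def outer():
    return inner() + 1


outer()
for record in TRACE:
    print(record)
```

Each entry record is paired with a later exit record for the same routine, so the nesting of the records reflects the execution flow.
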
[0005] In order to improve performance of code gen- 
erated by various families of computers, it is often nec- 
essary to determine where time is being spent by the 
processor in executing code, such efforts being commonly
known in the computer processing arts as locating
"hot spots." Ideally, one would like to isolate such
hot spots at the instruction and/or source line of code 
level in order to focus attention on areas which might 
benefit most from improvements to the code. 



[0006] Another trace technique involves periodically
sampling a program's execution flows to identify certain
locations in the program in which the program appears
to spend large amounts of time. This technique, so-called
sample-based profiling, is based on the idea of
interrupting the application or data processing system
execution at regular intervals. At each interruption,
information is recorded for a predetermined length of
time or for a predetermined number of events of interest.
For example, the program counter of the currently
executing thread may be recorded during the intervals.
These values may be resolved against a load map and
symbol information for the data processing system at
analysis time, and a profile of where the time is being
spent may be obtained from this analysis.
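
As a minimal sketch of resolving sampled program counter values against symbol information, the following assumes a hypothetical symbol table of (start address, name) pairs in which each symbol extends to the start of the next one. The table contents, the sample values, and the function names `resolve` and `profile` are illustrative assumptions only.

```python
import bisect
from collections import Counter

# Hypothetical symbol table, sorted by start address. Each symbol is
# assumed to extend up to the start address of the next symbol.
SYMBOLS = [(0x1000, "main"), (0x1400, "parse_input"), (0x2000, "compute")]
STARTS = [start for start, _ in SYMBOLS]


def resolve(pc):
    """Map a sampled program counter to the symbol that contains it."""
    i = bisect.bisect_right(STARTS, pc) - 1
    return SYMBOLS[i][1] if i >= 0 else "<unknown>"


def profile(samples):
    """Count samples per symbol; high counts suggest hot spots."""
    return Counter(resolve(pc) for pc in samples)


samples = [0x1004, 0x2010, 0x2044, 0x1408, 0x2100, 0x0800]
hot = profile(samples)
for name, count in hot.most_common():
    print(name, count)
```

The symbol with the most samples is the likeliest hot spot; samples below the first known address are reported as unresolved rather than guessed.
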

[0007] For example, isolating such hot spots to the
instruction level may identify significant areas of
sub-optimal code, which helps performance analysts focus
their attention on improving the performance of the
"important" code. This may also help compiler writers to
focus their attention on improving the efficiency of the
generated code. This is especially true for "jitted" code
(which is described later in this application). Another
potential use of instruction level detail is to provide
guidance to the designer of future systems. Such designers
employ profiling tools to find characteristic code
sequences and/or single instructions that require
optimization for the available software for a given type
of hardware.
[0008] Data processing system applications are typically
built with symbolic data and may even be shipped
to client devices with symbolic data still present in the
modules. Symbolic data is, for example, alphanumeric
representations of application module names, subroutine
names, function names, variable names, and the like.

[0009] The application is comprised of modules written
as source code in a symbolic language, such as
FORTRAN or C++, and then converted to machine
code through compilation of the source code. The machine
code is the native language of the computer. In
order for a program to run, it must be presented to the
computer as binary-coded machine instructions that are
specific to that CPU model or family.
[0010] Machine language tells the computer what to
do and where to do it. When a programmer writes: total
= total + subtotal, that statement is converted into a
machine instruction that tells the computer to add the
contents of two areas of memory where TOTAL and
SUBTOTAL are stored.

[0011] Since the application is executed as machine
code, performance trace data of the executed machine
code, generated by the trace tools, is provided in terms
of the machine code, i.e. process identifiers, addresses,
and the like. Thus, it may be difficult for a user of the
trace tools to identify the modules, instructions, and
such, from the pure machine code representations in the
performance trace data. Therefore, the trace data must
be correlated with symbolic data to generate trace data
that is easily interpreted by a user of the trace tools.
[0012] The symbolic data with which the trace data 
must be correlated may be distributed amongst a plu- 
rality of files. For example, the symbolic data may be 
present in debug files, map files, other versions of the 
application, and the like. In the known performance tool 
systems, in order to correlate the symbolic data with the 
performance trace data, the performance tool must 
know the locations of one or more of the sources of sym- 
bolic data and have a complex method of being able to 
handle redundancies in the symbolic data. 
[0013] In addition, such correlation is typically per- 
formed during post-processing of the performance trace 
data. Thus, an additional separate step is required for 
converting performance trace data into symbolic repre- 
sentations that may be comprehended by a perform- 
ance analyst. 

[0014] The conversion of performance trace data into
symbolic representations is performed at a time that 
may be remote to the time that the performance trace is 
performed. As a result, the symbolic data may not be 
consistent with the particular version of the computer 
program executed during the trace. This may be due to 
the fact that, for example, a newer version of the appli- 
cation was executed during the trace and the symbolic 
data corresponds to an older version of the application. 
[0015] This may be especially true for applications 
whose symbolic data is maintained at a supplier's loca- 
tion with the machine code being distributed to a plurality 
of clients. In such a case, the supplier may continue to 
update the symbolic data, i.e. create new versions of the 
application, but fail to provide the newest version of the 
application to all of the clients. In this scenario, if a per- 
formance trace were to be performed, the symbolic data 
maintained by the supplier may not be the same version 
as the machine code on which the performance trace is 
performed. 

[0016] Thus, it would be beneficial to have a mechanism
by which symbolic data from a plurality of sources
may be combined into a single source of symbolic data 
for an application undergoing performance analysis and 
being traced. It would further be beneficial to have a 
mechanism for verifying the symbolic data as corre- 
sponding to the same version of the application under- 
going performance analysis and being traced. Addition- 
ally, it would be beneficial to have a mechanism that al- 
lows for symbolic resolution to be performed as an inte- 
grated operation to the performance trace of the appli- 
cation. 

DISCLOSURE OF THE INVENTION 

[0017] The present invention provides an apparatus 
and method for cataloging symbolic data for use in per- 
formance analysis of computer programs. In particular, 
the present invention provides an apparatus and meth- 
od of storing symbolic data for executable modules. The 
symbolic data is used when performing a performance trace.

[0018] The present invention includes a mechanism
by which a merged symbol file is generated for a
computer program, or application, under trace. The merged
symbol file contains information useful in performing
symbolic resolution of address information in trace files
for each instance of a module.
[0019] During post processing of the trace information
generated by a performance trace of a computer program,
symbolic information stored in the merged symbol
file is compared to the trace information stored in the
trace file. The post processing typically occurs shortly
after the trace or at some remote time after the trace of
the computer program.

[0020] The trace information includes information
identifying the modules that are loaded during the trace
of the computer application. This trace information and
the merged symbol file are used to produce reports. The
correct symbolic information in the merged symbol file
for the loaded modules is identified based on a number
of validating criteria. Alternatively, the correct symbolic
information in the merged symbol file for the modules
used in the trace, or interrupted in the case of profiling,
is identified based on a number of validating criteria.

[0021] The correct symbolic information for the
required modules may then be stored as an indexed
database that is indexed, for example, by process and
address identifiers. The indexed database of symbolic
information may be stored as a separate file or as a
separate portion of a trace file for the computer application.
This indexed database may then be used to resolve
address information into corresponding symbolic
information when providing the trace information for use by a
user, such as a performance analyst.
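
A minimal sketch of a symbol database indexed by process and address identifiers is given below. The class `SymbolIndex`, its methods, and the module and symbol names in the usage lines are hypothetical and serve only to illustrate the lookup; the sketch assumes that address ranges within one process do not overlap.

```python
import bisect
from collections import defaultdict


class SymbolIndex:
    """Symbolic data indexed by process identifier and address (illustrative)."""

    def __init__(self):
        # pid -> sorted list of (start, end, module, symbol) entries
        self._by_pid = defaultdict(list)

    def add(self, pid, start, length, module, symbol):
        """Record that [start, start + length) in process pid is `symbol`."""
        bisect.insort(self._by_pid[pid], (start, start + length, module, symbol))

    def resolve(self, pid, address):
        """Return (module, symbol) for an address, or None if unresolved."""
        entries = self._by_pid.get(pid, [])
        # Locate the last entry whose start address is <= address.
        i = bisect.bisect_right(entries, (address, float("inf"), "", "")) - 1
        if i >= 0 and entries[i][0] <= address < entries[i][1]:
            return entries[i][2], entries[i][3]
        return None


index = SymbolIndex()
index.add(1, 0x80020000, 0x100, "foo.dll", "init")
index.add(1, 0x80020100, 0x80, "foo.dll", "run")
index.add(2, 0x80020000, 0x40, "bar.exe", "main")

print(index.resolve(1, 0x80020104))
print(index.resolve(2, 0x80020104))
```

Keying first on the process identifier allows the same address to resolve to different symbols in different processes, which is why both identifiers are needed.
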


BRIEF DESCRIPTION OF THE DRAWINGS 

[0022] The present invention will now be described by
way of example only, with reference to the accompanying
drawings, in which:

Figure 1 is an exemplary block diagram of a distributed
data processing system according to the present invention;

Figure 2A is an exemplary block diagram of a data
processing system according to the present invention;

Figure 2B is an exemplary block diagram of a data
processing system according to the present invention;

Figure 3A is a block diagram illustrating the relationship
of software components operating within a computer
system that may implement the present invention;

Figure 3B is an exemplary block diagram of a Java
Virtual Machine (JVM) according to the present invention;

Figure 4 is a block diagram depicting components
used to profile processes in a data processing system;

Figure 5 is an illustration depicting various phases
in profiling the active processes in an operating system;

Figure 6 is an exemplary diagram illustrating a time
sequence of events according to the present invention;

Figure 7 is a flowchart depicting an exemplary operation
of a trace program for generating trace records from
processes executing on a data processing system;

Figure 8 is a flowchart depicting an exemplary operation
of a system interrupt handler trace hook;

Figure 9 is an exemplary diagram illustrating the
generation of a merged symbol file in accordance
with the present invention;

Figure 10A is an exemplary diagram illustrating the
organization of a merged symbol file in accordance
with the present invention;

Figure 10B is an exemplary diagram of a merged
symbol file;

Figure 11 is an exemplary diagram of performance
trace data that may be stored as a trace file or
maintained in the trace buffer;

Figure 12 is an exemplary diagram of a Module Table
Entry file in accordance with the present invention;

Figure 13A is an exemplary diagram of an indexed
database according to the present invention;

Figure 13B is a flowchart outlining an exemplary
operation of a post-processor for generating an indexed
database based on the MTE data and the merged symbol file;

Figure 14 is a flowchart outlining an exemplary operation
of the present invention when generating an
indexed database of symbolic data;

Figure 15 is a flowchart outlining an exemplary operation
of the present invention when generating an
indexed database of symbolic data from performance
trace data stored in the trace buffer in a dynamic manner;

Figure 16 is a flowchart outlining an exemplary operation
of the present invention when verifying the
symbolic data and loaded module information;

Figure 17 is a flowchart outlining an exemplary operation
of the present invention when obtaining the
best match module entry from the merged symbol file;

Figure 18 is a flowchart outlining an exemplary operation
of the present invention when generating a
display of symbolic trace data;

Figure 19 is an exemplary diagram of a portion of
a typical Basic Block File (.bbf) for a computer program;

Figure 20 is an exemplary diagram of a portion of
a .bbf for a computer program in accordance with
the present invention; and

Figure 21 is a flowchart outlining an exemplary operation
of a further embodiment of the present invention.



DETAILED DESCRIPTION OF THE INVENTION 

[0023] With reference to Figure 1, a pictorial
representation of a distributed data processing system in
which the present invention may be implemented is de- 
picted. Distributed data processing system 100 is a net- 
work of computers in which the present invention may 
be implemented. Distributed data processing system 
100 contains a network 102, which is the medium used 
to provide communications links between various devic- 
es and computers connected together within distributed 
data processing system 100. Network 102 may include 
permanent connections, such as wire or fiber optic ca- 
bles, or temporary connections made through telephone 
connections. 

[0024] In the depicted example, a server 104 is con- 
nected to network 102 along with storage unit 106. In 
addition, clients 108, 110, and 112 also are connected 
to network 102. These clients 108, 110, and 112 may
be, for example, personal computers or network com- 
puters. For purposes of this application, a network com- 
puter is any computer, coupled to a network, which re- 
ceives a program or other application from another com- 
puter coupled to the network. In the depicted example, 
server 104 provides data, such as boot files, operating 
system images, and applications to clients 108-112. Cli- 
ents 108, 110, and 112 are clients to server 104. Dis- 
tributed data processing system 100 may include addi- 
tional servers, clients, and other devices not shown. In 
the depicted example, distributed data processing
system 100 is the Internet, with network 102 representing a
worldwide collection of networks and gateways that use 
the TCP/IP suite of protocols to communicate with one 
another. At the heart of the Internet is a backbone of 
high-speed data communication lines between major 
nodes or host computers, consisting of thousands of 
commercial, government, educational, and other com- 
puter systems, that route data and messages. Of 
course, distributed data processing system 100 also 
may be implemented as a number of different types of 
networks, such as, for example, an Intranet or a local 
area network. 

[0025] With reference now to Figure 2A, a block diagram
of a data processing system which may be implemented
as a server, such as server 104 in Figure 1, is
depicted in accordance with the present invention. Data
processing system 200 may be a symmetric multiproc- 
essor (SMP) system including a plurality of processors 
202 and 204 connected to system bus 206. Alternative- 
ly, a single processor system may be employed. Also 
connected to system bus 206 is memory controller/ 
cache 208, which provides an interface to local memory 
209. I/O Bus Bridge 210 is connected to system bus 206
and provides an interface to I/O bus 212. Memory con- 
troller/cache 208 and I/O Bus Bridge 210 may be inte- 
grated as depicted. 

[0026] Peripheral component interconnect (PCI) bus 
bridge 214 connected to I/O bus 212 provides an inter- 
face to PCI local bus 216. A modem 218 may be con- 
nected to PCI local bus 216. Typical PCI bus implemen- 
tations will support four PCI expansion slots or add-in 
connectors. Communications links to network computers
108-112 in Figure 1 may be provided through modem 218
and network adapter 220 connected to PCI local
bus 216 through add-in boards.
[0027] Additional PCI bus bridges 222 and 224 pro- 
vide interfaces for additional PCI buses 226 and 228, 
from which additional modems or network adapters may 
be supported. In this manner, server 200 allows connec- 
tions to multiple network computers. A memory mapped 
graphics adapter 230 and hard disk 232 may also be 
connected to I/O bus 212 as depicted, either directly or 
indirectly. 

[0028] Those of ordinary skill in the art will appreciate 
that the hardware depicted in Figure 2A may vary. For 
example, other peripheral devices, such as optical disk
drives and the like, also may be used in addition to or in place
of the hardware depicted. The depicted example is not 
meant to imply architectural limitations with respect to 
the present invention. 

[0029] The data processing system depicted in Fig- 
ure 2A may be, for example, an IBM RISC/System 6000 
system, a product of International Business Machines 
Corporation, running the Advanced Interactive Execu- 
tive (AIX) operating system. 

[0030] With reference now to Figure 2B, a block diagram
of a data processing system in which the present
invention may be implemented is illustrated. Data
processing system 250 is an example of a client computer.
Data processing system 250 employs a peripheral
component interconnect (PCI) local bus architecture.
Although the depicted example employs a PCI bus, other
bus architectures such as Micro Channel and ISA
may be used. Processor 252 and main memory 254 are
connected to PCI local bus 256 through PCI Bridge 258.
PCI Bridge 258 also may include an integrated memory
controller and cache memory for processor 252. Additional
connections to PCI local bus 256 may be made
through direct component interconnection or through
add-in boards. In the depicted example, local area
network (LAN) adapter 260, SCSI host bus adapter 262,
and expansion bus interface 264 are connected to PCI
local bus 256 by direct component connection. In contrast,
audio adapter 266, graphics adapter 268, and
audio/video adapter (A/V) 269 are connected to PCI local
bus 256 by add-in boards inserted into expansion slots.
Expansion bus interface 264 provides a connection for
a keyboard and mouse adapter 270, modem 272, and
additional memory 274. SCSI host bus adapter 262 provides
a connection for hard disk drive 276, tape drive
278, and CD-ROM 280 in the depicted example. Typical
PCI local bus implementations will support three or four
PCI expansion slots or add-in connectors.

[0031] An operating system runs on processor 252
and is used to coordinate and provide control of various
components within data processing system 250 in Figure 2B.
The operating system may be a commercially
available operating system such as JavaOS For Business
or OS/2, which are available from International
Business Machines Corporation. JavaOS is loaded from
a server on a network to a network client and supports
Java programs and applets. A couple of characteristics
of JavaOS that are favorable for performing traces with
stack unwinds, as described below, are that JavaOS
does not support paging or virtual memory. An object
oriented programming system such as Java may run in
conjunction with the operating system and may provide
calls to the operating system from Java programs or
applications executing on data processing system 250.
Instructions for the operating system, the object-oriented
operating system, and applications or programs are
located on storage devices, such as hard disk drive 276,
and may be loaded into main memory 254 for execution
by processor 252. Hard disk drives are often absent and
memory is constrained when data processing system
250 is used as a network client.
[0032] Those of ordinary skill in the art will appreciate
that the hardware in Figure 2B may vary depending on
the implementation. For example, other peripheral devices,
such as optical disk drives and the like, may be
used in addition to or in place of the hardware depicted
in Figure 2B. The depicted example is not meant to imply
architectural limitations with respect to the present
invention. For example, the processes of the present
invention may be applied to a multiprocessor data
processing system.
[0033] The present invention provides a method and
system for processing performance trace data of soft- 
ware applications. Although the present invention may 
operate on a variety of computer platforms and operat- 
ing systems, it may also operate within an interpretive 
environment, such as a REXX, Smalltalk, or Java runt- 
ime environment, and the like. For example, the present 
invention may operate in conjunction with a Java virtual 
machine (JVM) yet within the boundaries of a JVM as 
defined by Java standard specifications. In order to pro- 
vide a context for the present invention with regard to 
an exemplary interpretive environment, portions of the 
operation of a JVM according to Java specifications are 
herein described. 

[0034] With reference now to Figure 3A, a block dia- 
gram illustrates the relationship of software components 
operating within a computer system that may implement 
the present invention. Java-based system 300 contains 
platform specific operating system 302 that provides 
hardware and system support to software executing on 
a specific hardware platform. JVM 304 is one software 
application that may execute in conjunction with the op- 
erating system. JVM 304 provides a Java run-time en- 
vironment with the ability to execute Java application or 
applet 306, which is a program, servlet, or software com- 
ponent written in the Java programming language. The 
computer system in which JVM 304 operates may be 
similar to data processing system 200 or computer 100 
described above. However, JVM 304 may be imple- 
mented in dedicated hardware on a so-called Java chip, 
Java-on-silicon, or Java processor with an embedded 
picoJava core. At the centre of a Java run-time environment
is the JVM, which supports all aspects of Java's
environment, including its architecture, security fea- 
tures, mobility across networks, and platform independ- 
ence. 

[0035] The JVM is a virtual computer, i.e. a computer 
that is specified abstractly. The specification defines
certain features that every JVM must implement, with 
some range of design choices that may depend upon 
the platform on which the JVM is designed to execute. 
For example, all JVMs must execute Java bytecodes 
and may use a range of techniques to execute the in- 
structions represented by the bytecodes. A JVM may be 
implemented completely in software or somewhat in 
hardware. This flexibility allows different JVMs to be de- 
signed for mainframe computers and PDAs. 
[0036] The JVM is the name of a virtual computer 
component that actually executes Java programs. Java 
programs are not run directly by the central processor 
but instead by the JVM, which is itself a piece of software
running on the processor. The JVM allows Java pro- 
grams to be executed on a different platform as opposed 
to only the one platform for which the code was com- 
piled. Java programs are compiled for the JVM. In this 
manner, Java is able to support applications for many 
types of data processing systems, which may contain a 
variety of central processing units and operating system
architectures. To enable a Java application to execute
on different types of data processing systems, a
compiler typically generates an architecture-neutral file
format: the compiled code is executable on many processors,
given the presence of the Java run-time system.
[0037] The Java compiler generates bytecode instructions
that are nonspecific to a particular computer
architecture. A bytecode is a machine independent code
generated by the Java compiler and executed by a Java
interpreter. A Java interpreter is part of the JVM that
alternately decodes and interprets a bytecode or bytecodes.
These bytecode instructions are designed to be
easy to interpret on any computer and easily translated
on the fly into native machine code.

[0038] A JVM must load class files and execute the
bytecodes within them. The JVM contains a class loader,
which loads class files from an application and the
class files from the Java application programming
interfaces (APIs) which are needed by the application. The
execution engine that executes the bytecodes may vary
across platforms and implementations.
[0039] One type of software-based execution engine
is a just-in-time (JIT) compiler. With this type of execution,
the bytecodes of a method are compiled to native
machine code upon successful fulfilment of some type
of criteria for "jitting" a method. The native machine code
for the method is then cached and reused upon the next
invocation of the method. The execution engine may also
be implemented in hardware and embedded on a
chip so that the Java bytecodes are executed natively.
JVMs usually interpret bytecodes, but JVMs may also
use other techniques, such as just-in-time compiling, to
execute bytecodes.
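
The compile-then-cache behaviour described above may be sketched, purely by way of example, with a toy execution engine. The two-operation "bytecode" format, the invocation-count threshold used as the jitting criterion, and the class `ExecutionEngine` are all assumptions made for this sketch; they do not describe any actual JVM, and a Python function stands in for native machine code.

```python
class ExecutionEngine:
    """Toy engine: interpret a method until it is 'hot', then compile and cache."""

    def __init__(self, threshold=2):
        self.threshold = threshold  # hypothetical jitting criterion
        self.call_counts = {}       # method name -> invocations so far
        self.code_cache = {}        # method name -> compiled callable
        self.compilations = 0

    def _interpret(self, bytecode, arg):
        # Decode and execute one (op, operand) pair at a time.
        acc = arg
        for op, operand in bytecode:
            if op == "add":
                acc += operand
            elif op == "mul":
                acc *= operand
            else:
                raise ValueError(f"unknown op: {op}")
        return acc

    def _compile(self, bytecode):
        # Stand-in for native code generation: translate the whole method
        # into a single Python expression and build a function from it once.
        self.compilations += 1
        expr = "x"
        for op, operand in bytecode:
            symbol = {"add": "+", "mul": "*"}[op]
            expr = f"({expr} {symbol} {int(operand)})"
        return eval(f"lambda x: {expr}")

    def invoke(self, name, bytecode, arg):
        if name in self.code_cache:
            return self.code_cache[name](arg)  # reuse cached code
        count = self.call_counts.get(name, 0) + 1
        self.call_counts[name] = count
        if count >= self.threshold:
            self.code_cache[name] = self._compile(bytecode)
            return self.code_cache[name](arg)
        return self._interpret(bytecode, arg)


engine = ExecutionEngine(threshold=2)
method = [("add", 3), ("mul", 2)]
results = [engine.invoke("f", method, 5) for _ in range(4)]
print(results, engine.compilations)
```

The first invocation is interpreted, the second triggers compilation, and later invocations reuse the cached code, so the method is compiled only once.
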

[0040] Interpreting code provides an additional benefit.
Rather than instrumenting the Java source code, the
interpreter may be instrumented. Trace data may be
generated via selected events and timers through the
instrumented interpreter without modifying the source
code. Performance trace instrumentation is discussed
in more detail further below.

[0041] When an application is executed on a JVM that
is implemented in software on a platform-specific operating
system, a Java application may interact with the
host operating system by invoking native methods. A
Java method is written in the Java language, compiled
to bytecodes, and stored in class files. A native method
is written in some other language and compiled to the
native machine code of a particular processor. Native
methods are stored in a dynamically linked library
whose exact form is platform specific.

[0042] With reference now to Figure 3B, a block diagram
of a JVM is depicted in accordance with a preferred
embodiment of the present invention. JVM 350
includes a class loader subsystem 352, which is a mechanism
for loading types, such as classes and interfaces,
given fully qualified names. JVM 350 also contains runtime
data areas 354, execution engine 356, native method
interface 358, and memory management 374. Execution
engine 356 is a mechanism for executing instructions
contained in the methods of classes loaded by
class loader subsystem 352. Execution engine 356 may
be, for example, Java interpreter 362 or just-in-time
compiler 360. Native method interface 358 allows access
to resources in the underlying operating system.
Native method interface 358 may be, for example, a
Java native interface.

[0043] Runtime data areas 354 contain native method 
stacks 364, Java stacks 366, PC registers 368, method 
area 370, and heap 372. These different data areas rep- 
resent the organization of memory needed by JVM 350 
to execute a program. 

[0044] Java stacks 366 are used to store the state of 
Java method invocations. When a new thread is 
launched, the JVM creates a new Java stack for the 
thread. The JVM performs only two operations directly 
on Java stacks: it pushes and pops frames. A thread's 
Java stack stores the state of Java method invocations 
for the thread. The state of a Java method invocation 
includes its local variables, the parameters with which it 
was invoked, its return value, if any, and intermediate 
calculations. Java stacks are composed of stack 
frames. A stack frame contains the state of a single Java 
method invocation. When a thread invokes a method, 
the JVM pushes a new frame onto the Java stack of the 
thread. When the method completes, the JVM pops the 
frame for that method and discards it. 
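
The push and pop behaviour of a per-thread stack of frames may be illustrated with the following sketch. The names `Frame` and `JavaStack` and the fields of a frame are simplifications assumed for illustration and do not reproduce the layout used by any particular JVM.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """State of a single method invocation (simplified)."""
    method: str
    local_vars: dict = field(default_factory=dict)
    # Intermediate calculations are held here.
    operand_stack: list = field(default_factory=list)


class JavaStack:
    """Per-thread stack; the only direct operations are push and pop."""

    def __init__(self):
        self._frames = []

    def push(self, frame):
        self._frames.append(frame)

    def pop(self):
        return self._frames.pop()

    def current(self):
        return self._frames[-1]

    def depth(self):
        return len(self._frames)


stack = JavaStack()                    # created when the thread is launched
stack.push(Frame("main"))              # thread invokes main
stack.push(Frame("helper", {"n": 1}))  # main invokes helper
stack.current().operand_stack.append(42)
finished = stack.pop()                 # helper completes; its frame is discarded
print(finished.method, stack.current().method, stack.depth())
```

Because each thread owns its own stack, unwinding one thread's frames yields the chain of method invocations that led to the point at which that thread was interrupted.
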
[0045] The JVM does not have any registers for hold- 
ing intermediate values; any Java instruction that re- 
quires or produces an intermediate value uses the stack 
for holding the intermediate values. In this manner, the 
Java instruction set is well-defined for a variety of plat- 
form architectures. 

[0046] PC registers 368 are used to indicate the next 
instruction to be executed. Each instantiated thread gets 
its own pc register (program counter) and Java stack. If 
the thread is executing a JVM method, the value of the 
pc register indicates the next instruction to execute. If 
the thread is executing a native method, then the con- 
tents of the pc register are undefined. 
[0047] Native method stacks 364 store the state of in- 
vocations of native methods. The state of native method 
invocations is stored in an implementation-dependent 
way in native method stacks, registers, or other imple- 
mentation-dependent memory areas. In some JVM im- 
plementations, native method stacks 364 and Java 
stacks 366 are combined. 

[0048] Method area 370 contains class data while 
heap 372 contains all instantiated objects. The JVM 
specification strictly defines data types and operations. 
Most JVMs choose to have one method area and one 
heap, each of which is shared by all threads running
inside the JVM. When the JVM loads a class file, it pars- 
es information about a type from the binary data con- 
tained in the class file. It places this type information into 
the method area. Each time a class instance or array is
created, the memory for the new object is allocated from
heap 372. JVM 350 includes an instruction that allocates
memory space within the memory for heap 372 but
includes no instruction for freeing that space within the
memory.

[0049] Memory management 374 in the depicted example
manages memory space within the memory allocated
to heap 372. Memory management 374 may include
a garbage collector which automatically reclaims
memory used by objects that are no longer referenced.
Additionally, a garbage collector also may move objects
to reduce heap fragmentation.
[0050] The present invention is equally applicable to
either a platform specific environment, that is, a traditional
computer application environment loading modules
or native methods, or a platform independent environment,
such as an interpretative environment, for example,
a Java environment loading classes, methods
and the like. For purposes of explanation of the features
and advantages of the present invention and to accentuate
the ability of the present invention to operate in
either environment, examples of the operation of the
present invention will be described in terms of both a
Java environment and a traditional computer operating
environment.

[0051] The present invention provides a mechanism
by which a merged file of the symbolic data is generated.
The present invention also provides a mechanism by
which performance traces of applications, such as Java
applications, and symbolic resolution can be performed
in which the symbolic data is verified as being the correct
symbolic data for incremental or on-demand resolution
of addresses, such as with performance trace data. In
addition, the present invention provides a mechanism
by which an indexed database of symbolic data is generated
as either a separate file or as a separate section
of a trace file. While the present invention is applicable
to any incremental or on-demand resolution of symbolic
information, the present invention will be explained in
terms of a performance trace of a computer program for
illustrative purposes.

[0052] With reference now to Figure 4, a block diagram
depicts components used to perform performance
traces of processes in a data processing system. A trace
program 400 is used to profile processes 402. Trace program
400 may be used to record data upon the execution
of a hook, which is a specialised piece of code at a
specific location in a routine or program in which other
routines may be connected. Trace hooks are typically
inserted for the purpose of debugging, performance
analysis, or enhancing functionality. These trace hooks
are employed to send trace data to trace program 400,
which stores the trace data in buffer 404. The trace data
in buffer 404 may be subsequently stored in a file for
post-processing, or the trace data may be processed in
real-time. The trace data in either the buffer 404 or the
trace file is then processed by the post-processor 406
to generate an indexed database of symbolic data for
loaded modules, as described more fully hereafter.
[0053] In a non-Java environment, the present invention
employs trace hooks that aid in the identification of
modules that are used in an application under trace.
With Java operating systems, the present invention employs
trace hooks that aid in identifying loaded classes
and methods.

[0054] In addition, since classes and modules may be
loaded and unloaded, these changes may also be identified
using trace data. This is especially relevant with
"network client" data processing systems, such as those
that may operate under Java OS, since classes and jitted
methods may be loaded and unloaded more frequently
due to the constrained memory and role as a
network client. Note that class or module load and unload
information is also relevant in embedded application
environments, which tend to be memory constrained.

[0055] With reference now to Figure 5, a diagram depicts
various phases in performing a performance trace
of the workload running on a system. Subject to memory
constraints, the generated trace output may be as long
and as detailed as the analyst requires for the purpose
of profiling a particular program.
[0056] An initialization phase 500 is used to capture
the state of the client machine at the time tracing is initiated.
This trace initialization data includes trace
records that identify all existing threads, all loaded classes
(modules), and all methods (sections) for the loaded
classes (modules). Records from trace data captured
from hooks are written to indicate thread switches, interrupts,
and loading and unloading of classes (modules)
and "jitted" methods (sections).
[0057] Any class (module) which is loaded has trace
records that indicate the name of the class (module) and
its methods (sections). In the depicted example, four
byte IDs are used as identifiers for threads, classes, and
methods. These IDs are associated with names that
have been output in the trace records. A trace record is
written to indicate when all of the start up information
has been written.
[0058] Next, during the profiling phase 502, trace
records are written to a trace buffer or trace file. In the
present invention, a trace buffer may have a combination
of types of records, such as those that may originate
from a trace hook executed in response to a particular
type of event, e.g., a method entry or method exit, and
those that may originate from a stack walking function
executed in response to a timer interrupt, e.g., a stack
unwind record, also called a call stack record.
[0059] For example, the following operations may occur
during the profiling phase if the user of the profiling
utility has requested sample-based profiling information.
Each time a particular type of timer interrupt occurs,
a trace record is written, which indicates the system program
counter. This system program counter may be
used to identify the routine that is interrupted. In the depicted
example, a timer interrupt is used to initiate gathering
of trace data. Of course, other types of interrupts
may be used other than timer interrupts. Interrupts
based on a programmed performance monitor event or
other types of periodic events may be employed, for example.

[0060] In the post-processing phase 504, the data collected
in the trace buffer is processed or sent to a trace
file for post-processing. In one configuration, the file
may be sent to a server, which determines the profile for
the processes on the client machine. Of course, depending
on available resources, the post-processing also
may be performed on the client machine.
[0061] With the present invention, in accordance with
a first exemplary embodiment, the post-processing consists
of utilising a merged symbol file to correlate symbolic
data with performance trace data, that is, to perform
symbolic resolution. This may be done with either
the performance trace data stored in the trace buffer or
the performance trace data in the trace file. The post-processing
may be performed as an incorporated operation
such that the post-processing is performed immediately
after the performance trace is performed, during
the performance trace in real time, or at a time remote
from the time that the performance trace is performed.
[0062] As part of the symbolic resolution process, the
symbolic data for the modules/processes is verified as
being the correct symbolic data for the versions of the
modules/processes in the performance trace data. This
verification is based on various criteria including checksum,
timestamp, fully qualified path, segment sizes, and
the like.

[0063] The symbolic resolution provides symbolic data
for loaded modules/processes of the application under
trace. As a result of the symbolic resolution, an indexed
database of the symbolic data for the loaded
modules/processes is generated. The indexed database
may be based on the performance trace data in
the trace buffer or the performance trace data in the
trace file, as will be described in more detail hereafter.
[0064] Figure 6 is an exemplary diagram illustrating
the time relationship of the various processes employed
during a performance trace of an application and subsequent
generation of an indexed database for loaded
modules/processes. Figure 6 assumes that the post-processing
of the performance trace data is performed
at some time after the performance trace is completed.
However, as noted above, the post-processing may also
be performed during the performance trace such that,
as the performance trace data is written to the trace buffer,
the post-processing is performed on the written performance
trace data. In this way, the amount of time necessary
to complete the performance trace and post-processing
is reduced.

[0065] As shown in Figure 6, the performance trace
is initiated at time t0 when the application execution is
started. The performance trace ends at time t1 when the
application execution is ended.
[0066] Subsequent to the performance trace, at time
t2, a merged symbol file of the symbolic data for the application
under trace is generated. While Figure 6
shows the generation of the merged symbol file being
performed after the application trace is completed, the
invention is not limited to such an embodiment. Rather,
the merged symbol file may be generated before the
performance trace is initiated or as part of trace finalization.
An alternate embodiment may perform symbolic
resolution in real-time (during the trace) for concurrent
display of trace information.

[0067] At some time tn subsequent to the performance
trace and the generation of the merged symbol file,
the loaded or used modules/processes during the per- 
formance trace are determined and an indexed data- 
base of the symbolic data for the loaded or used mod- 
ules/processes is generated. This indexed database 
may be generated as a post-processing of the perform- 
ance trace data in the trace buffer immediately after the 
performance trace is ended. Alternatively, the indexed 
database may be generated as a post-processing of the 
performance trace data stored in a trace file at some 
time remote from the actual performance trace. 
[0068] With reference now to Figure 7, a flowchart de- 
picts an exemplary operation of a performance trace tool 
for generating trace records from modules/processes 
executing on a data processing system. Trace records 
may be produced by the execution of small pieces of 
code called "hooks". Hooks may be inserted in various 
ways into the code executed by processes, including 
statically (source code) and dynamically (through mod- 
ification of a loaded executable). The operation depicted 
in Figure 7 is employed after trace hooks have already 
been inserted into the process or processes of interest. 
The operation begins by allocating a buffer (step 700), 
such as buffer 404 in Figure 4. Next, in the depicted 
example, trace hooks are turned on (step 702), and trac- 
ing of the processes on the system begins (step 704). 
Trace data is received from the processes of interest 
(step 706). This type of tracing may be performed during 
phases 500 and/or 502, for example. This trace data is 
stored as trace records in the buffer (step 708). 
[0069] A determination is made as to whether tracing 
has finished (step 710). Tracing finishes when the trace 
buffer has been filled or the user stops tracing via a com- 
mand and requests that the buffer contents be sent to 
file. If tracing has not finished, the operation returns to 
step 706 as described above. Otherwise, when tracing 
is finished, the buffer contents are sent to a file for post- 
processing (step 712). A report is then generated in 
post-processing (step 714) with the operation terminat- 
ing thereafter. 
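
The flow of Figure 7 can be sketched as a simple loop. This is a minimal sketch under stated assumptions: the event dictionaries, the JSON-lines file format, and the names `run_trace`, `buffer_capacity` and `trace_path` are hypothetical and are not taken from the described tool; only the buffer-full stop condition is modelled, not a user stop command.

```python
import json
import os
import tempfile


def run_trace(events, buffer_capacity, trace_path):
    """Buffer trace records until the buffer is full or the events end,
    then send the buffer contents to a file (steps 700 to 712)."""
    buffer = []                             # step 700: allocate a buffer
    for event in events:                    # steps 704/706: receive trace data
        buffer.append(event)                # step 708: store a trace record
        if len(buffer) >= buffer_capacity:  # step 710: buffer filled, stop
            break
    with open(trace_path, "w") as out:      # step 712: buffer contents to file
        for record in buffer:
            out.write(json.dumps(record) + "\n")
    return len(buffer)


# Hypothetical events standing in for data sent by trace hooks.
events = [{"major": 10, "minor": 3, "pc": 0x1000 + i} for i in range(5)]
trace_path = os.path.join(tempfile.mkdtemp(), "trace.jsonl")
written = run_trace(events, buffer_capacity=3, trace_path=trace_path)
```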

[0070] Although the depicted example uses post- 
processing to analyze the trace records, the operations 
of the present invention may be used to process trace 
data in real-time depending on the implementation. If the 
trace data is processed in real-time, the processing of 
the trace data in the trace buffer would begin immedi- 
ately after step 710 above. By processing the trace data 
in real-time, the dynamic state of the system may be
identified. By processing the trace data in real-time, profiler
reports may be displayed concurrently with program
execution.

[0071] This approach is especially useful for jitted 
methods. A jitted method is converted from bytecodes 
to machine code just before the program is run. In the 
case of Java, jitted methods are converted from byte- 
code to native code. Thus, the dynamic nature of jitted 
methods may be accommodated by processing trace 
data dynamically. With reference now to Figure 8, a
flowchart depicts an exemplary operation that may be 
used during an interrupt handler trace hook. The oper- 
ation begins by obtaining a program counter (step 800). 
Typically, the program counter is available in one of the 
saved program stack areas. Thereafter, a determination 
is made as to whether the code being interrupted is in- 
terpreted code (step 802). This determination may be 
made by determining whether the program counter is 
within an address range for the interpreter used to inter- 
pret bytecodes. 

[0072] If the code being interrupted is interpreted, a 
method block address is obtained for the code being in- 
terpreted. A trace record is then written (step 806). The 
trace record is written by sending the trace data to a 
trace program, such as trace program 400, which gen- 
erates trace records for post-processing in the depicted 
example. This trace record is referred to as an interrupt 
record, or an interrupt hook. 
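
The interrupt handler flow of Figure 8 can be sketched as below. The interpreter address range, the record layout, and the names `INTERPRETER_RANGE`, `interrupt_hook` and `write_record` are assumptions made for illustration; the sketch only shows the range test on the program counter and the writing of the record.

```python
# Hypothetical address range occupied by the bytecode interpreter.
INTERPRETER_RANGE = (0x40000000, 0x40010000)


def interrupt_hook(saved_pc, current_method_block, write_record):
    """Build and write one interrupt record for the interrupted code."""
    low, high = INTERPRETER_RANGE
    record = {"pc": saved_pc}                 # step 800: obtain the pc
    if low <= saved_pc < high:                # step 802: interpreted code?
        # Interpreted code: also record the method block address.
        record["method_block"] = current_method_block
    write_record(record)                      # step 806: write the trace record
    return record


records = []
interrupt_hook(0x40000010, 0xBEEF, records.append)  # pc inside the interpreter
interrupt_hook(0x00001000, 0xBEEF, records.append)  # pc in other code
```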

[0073] This type of trace may be performed during 
phase 502. Alternatively, a similar process, that is, de- 
termining whether code that was interrupted is interpret- 
ed code, may occur during post-processing of a trace 
file. In this case, the last interpreted method being exe- 
cuted is always written as part of the trace record. 
[0074] As described above, either before, during or af- 
ter the performance trace is performed, a merged sym- 
bol file of the symbolic data for the application under 
trace is generated. Figure 9 is a graphical depiction of 
the generation of the merged symbol file according to 
the present invention for a traditional computer execution
environment.

[0075] As shown in Figure 9, the merged symbol file 
910 is comprised of symbolic data for modules obtained 
from map files 920, debug files 930, non-stripped ver- 
sions of modules 930, and other symbolic data files 940. 
These sources of symbolic data may be stored, for ex- 
ample, in local memory 209, hard disk 232, one or more 
of the devices 276-282, or any other type of data storage 
device. The merged symbol file 910 may likewise be
stored in any of these storage devices or the like.
[0076] The data processing system of the present in- 
vention is provided with the fully qualified path of the 
various sources of symbolic data and combines symbol- 
ic information describing various executable files into a 
single, merged symbol file. An exemplary embodiment 
of the format of this file is described in Figure 10A. 
[0077] The resulting merged symbol file has one entry
(represented abstractly by a HeaderData entry in the
merged symbol file) for each module. There may be multiple
entries for modules with the same name if, for in-
stance, multiple versions of a module exist on the sys- 
tem or if there are distinct modules with identical names 
in different paths on the system. 
[0078] Figure 10A is an exemplary diagram illustrating
the organization of a merged symbol file in accordance
with the present invention. As shown in Figure
10A, the merged symbol file is organized in a hierarchi- 
cal manner. At the top of the hierarchy is information 
1001 identifying the particular platform on which the ap- 
plication is located. This information includes a header, 
a case sensitivity flag, slash character, and the like. 
[0079] At the next level of the hierarchy the merged 
elements 1002 are identified. The merged elements in- 
clude n number of modules that are identified by their 
module name, that is the base name without extensions 
of the particular modules in the application. 
[0080] Each merged element may represent 1 to n 
distinct modules that happen to have the same base 
name. Thus, for example, during creation of the merged 
symbol file, if an executable, foo.exe, is encountered 
and a corresponding debug file, foo.dbg, is also encoun- 
tered, the symbolic data from both of these files is 
merged into a single image (described by a single data 
element 1002). If, however, an executable, foo.exe, and 
a debug file with the same base name, foo.dbg, are en- 
countered but it is determined that these do not corre- 
spond to the same module (for example, if they contain 
different checksum or timestamp, possibly indicating 
that they correspond to different versions of the module),
then two distinct images of the modules (represented
by distinct data elements 1002) are created with distinct
symbolic information.

[0081] These images of the module are identified by 
module headers 1003 that include the module path, ex- 
tension, checksum, and timestamp. Each image of the 
module may contain 1 to n sections, each representing 
a collection of routines, a collection of writable data el- 
ements or read only data elements, and the like. 
[0082] These sections are identified by a section 
header 1004 that contains the section name, offset, and 
length. Each section may contain 1 to n symbolic data 
1005. The symbolic data 1005 is identified by the sym- 
bolic name, offset from the top of the module and/or a 
length. 
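
The hierarchy of Figure 10A maps naturally onto nested records. The following is a minimal sketch, assuming in-memory dataclasses rather than the actual on-disk format, which is not specified here; all class and field names, and the helper `find_symbol`, are hypothetical. Symbol offsets are taken relative to the top of the module, as stated above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Symbol:                      # symbolic data 1005
    name: str
    offset: int                    # offset from the top of the module
    length: int


@dataclass
class Section:                     # section header 1004
    name: str
    offset: int
    length: int
    symbols: List[Symbol] = field(default_factory=list)


@dataclass
class ModuleImage:                 # module header 1003
    path: str
    extension: str
    checksum: Optional[int]
    timestamp: Optional[int]
    sections: List[Section] = field(default_factory=list)


@dataclass
class MergedElement:               # merged element 1002, keyed by base name
    base_name: str
    images: List[ModuleImage] = field(default_factory=list)


def find_symbol(image, module_offset):
    """Return the name of the symbol covering module_offset, if any."""
    for section in image.sections:
        for sym in section.symbols:
            if sym.offset <= module_offset < sym.offset + sym.length:
                return sym.name
    return None


text = Section(".text", 0x1000, 0x5000, [Symbol("main", 0x1000, 0x40)])
image = ModuleImage(r"C:\WINNT\foo.exe", "exe", 0x1A2B, 0x3C4D, [text])
element = MergedElement("foo", [image])
```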

[0083] Figure 10B is an example illustration of a 
merged symbol file in accordance with the present in- 
vention. Figure 10B assumes a non-Java environment 
and is directed to particular modules of an application. 
However, the present invention, as noted above, is 
equally applicable to a Java environment. 
[0084] As shown in Figure 10B, the merged symbol file
1000 includes a mergesym header 1010, a merged element
identifier 1020, and a module name 1030. The
mergesym header 1010, the merged element identifier
1020 and the module name 1030 store information
about how the merged symbol file 1000 was generated.
In addition, these elements store information about the
system on which the file was generated (such as the
number of processors or the operating system in use).
The merged element identifier 1020 forms a top level
index into the merged symbol file 1000 by base name.
[0085] The merged symbol file further includes information
pertaining to each module having the module
name. Thus, in the example shown in Figure 10B, two
modules having the module name "foo" are present in
the merged symbol file. Entries 1040 and 1050 for each
of the modules are provided in the merged symbol file.
[0086] Each entry 1040 and 1050 provides informa- 
tion 1060 pertaining to the identification of a particular 
module and the symbolic data 1070 associated with the 
module. The symbolic data is divided into loadable sec- 
tions having section headers. Each loadable section has 
a section name, offset and length. 
[0087] The information 1060 pertaining to the identification
of a particular module includes such information
as the fully qualified path of the module, the module extension,
a checksum, and timestamp for the module.
The symbolic data provides the symbol name, offset and
length for each symbol. By using the offset and the
length associated with the section and the symbolic data,
the exact identity of the symbolic data can be determined
and correlated with addresses in performance
trace data.

[0088] In addition to the above, the merged symbol
file may include a "confidence" measure, or degree of
quality of the symbols, for each module. The confidence
measure may be, for example, an indicator of the types
of symbolic data that were obtained during the merge
process. For example, the confidence measure may
provide an indication of whether all the exportable symbols,
internal symbols and static symbols have been obtained
for the particular module. This confidence measure
may be reported to a user for their use in determining
the quality of the symbolic resolution in accordance with
the present invention.
[0089] While the modules shown in Figure 10B have
the same module name, they are different modules as
is clear from the module information stored in the
merged symbol file. The entries 1040 and 1050 represent
different modules in that the path, checksum,
timestamp, length, and symbolic data are different for
the two modules. The modules themselves may be two
different versions of the same module, however. For
example, a later version of the "foo.exe" module in the
"C:\temp\" directory may have been created and stored
in the directory "C:\WINNT\".

[0090] When the checksum and the time stamp are
not available or the fully qualified path name is not used,
known systems of performance tracing are not capable
of discerning which of the modules is the correct module
for identifying the symbolic data associated with the performance
trace data. The known systems match based
on base name and are dependent on the user to make
sure that the symbols they provide are for the correct
versions of the modules.

[0091] For example, Windows 2000, available from
Microsoft Corporation, requires the user to specify the
fully qualified path name to the source file and to the
symbolic information with the exception of some fixed
conventions, such as the system directory in the Windows
operating systems. This directory is identified by
the SystemRoot environment variable. Thus, a default
location may be accessed by, for example, the path
"%SystemRoot%/Symbols/." Thus, if there is more
than one module with the same module name, either as
different modules, or different versions of the same module,
an error may occur in that the wrong module is used
to perform symbolic resolution.

[0092] Relying solely on the fully qualified path does
not provide a solution to this problem because:

1. the fully qualified path may not be available on
all systems;

2. sometimes it is convenient to generate symbols
out of a different directory than the one from which
the system loads the modules; and

3. the fully qualified path is not a failsafe criterion
for matching. If the trace is post processed at a time
remote from collection of the trace information itself,
then it is possible that a module has been upgraded
to a more recent version in the mean time. In this
case, the fully qualified paths would match, but one
would not want to use the symbols from the module
at that location.

[0093] The present invention provides a mechanism
that works even in the case that it is not possible to obtain
trace information that contains the fully qualified
path. In addition, the present invention allows for generating
symbols out of a different directory than the one
from which the system loads the modules. For example,
the present invention allows for post processing of trace
information and generation of merged symbol files on a
system that is not the system under test. Furthermore,
the present invention provides a mechanism by which
the correct symbolic data is matched with the performance
trace data. The mechanism makes use of a
number of checks to determine if the module identified
in the merged symbol file is the same module as in the
performance trace data.

[0094] Figure 11 is an exemplary diagram of performance
trace data. The performance trace data 1100 in
Figure 11 may be maintained in the trace buffer or may
be written to a trace file following the performance trace.
The trace file may be stored, for example, in any of the
storage devices 209, 232, 276-282, or the like. The performance
trace data includes the following fields:

Field 1: Trace hook major code;
Field 2: Trace hook minor code;
Field 3: Timestamp (upper 32 bits: lower 32 bits);
Field 4: Not used;
Field 5: Process identification (pid);
Field 6: Segment load address;
Field 7: Segment length;
Field 8: Segment flags (these are flags that indicate
permission levels on the pages into which the
segment gets loaded and the like);
Field 9: Module checksum;
Field 10: Module timestamp;
Field 11: Segment name; and
Field 12: Module name.



[0095] The performance trace data 1100 includes 
performance trace data for Module Table Entry (MTE) 
trace hooks as well as time profiler (Tprof) trace hooks. 
[0096] The fields for MTE trace hooks in the trace file 
are described above. The MTE trace data is provided in 
the entries having a trace hook major code of 19 and a
minor code of 38. The trace hook major and minor codes 
19 and 38 are the major and minor codes that are used 
in the exemplary embodiment to indicate an MTE hook. 
Other codes may be used without departing from the 
spirit and scope of the present invention. 
[0097] For a Tprof trace hook (major code 10 and minor
code 03), the fields will be slightly different in that
field 5 will correspond to a program counter, field 6 will
correspond to a pid, field 7 will correspond to a thread
id, and field 8 will correspond to a code privilege level. The
code privilege level indicates the privileges that the executing
code has. For example, the code privilege level
may indicate whether the executing code is in user
space or kernel space.
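
The field layouts for MTE and Tprof hooks can be illustrated with a small parser. This sketch assumes that a record is a single line of whitespace-separated fields with hexadecimal addresses, lengths, flags and checksums, and that names contain no spaces; the actual record encoding is not specified in the text, and `parse_record` and the sample lines are hypothetical.

```python
MTE_LOAD = (19, 38)   # major and minor codes for an MTE load hook
TPROF = (10, 3)       # major and minor codes for a Tprof hook


def parse_record(line):
    """Split one trace line into named fields based on its hook codes."""
    f = line.split()
    code = (int(f[0]), int(f[1]))           # fields 1 and 2
    if code == MTE_LOAD:
        return {
            "kind": "mte_load",
            "timestamp": f[2],              # field 3 (upper:lower)
            "pid": int(f[4]),               # field 5
            "load_address": int(f[5], 16),  # field 6
            "length": int(f[6], 16),        # field 7
            "flags": int(f[7], 16),         # field 8
            "checksum": int(f[8], 16),      # field 9
            "module_timestamp": int(f[9], 16),  # field 10
            "segment": f[10],               # field 11
            "module": f[11],                # field 12
        }
    if code == TPROF:
        return {
            "kind": "tprof",
            "timestamp": f[2],
            "pc": int(f[4], 16),            # field 5: program counter
            "pid": int(f[5]),               # field 6
            "tid": int(f[6]),               # field 7: thread id
            "privilege": int(f[7]),         # field 8: code privilege level
        }
    return {"kind": "other", "code": code}


mte = parse_record(r"19 38 0:1500 0 124 77f00000 5000 20 1a2b 3c4d .text C:\WINNT\foo.exe")
sample = parse_record("10 03 0:1600 0 77f00010 124 7 0")
```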

[0098] The tprof hooks contain the trace data that is 
used to profile the system under test. At postprocessing 
time, the pid and address combinations in the tprof 
hooks are resolved into symbols. The post processor 
combines the MTE information and the merged symbol 
file into an indexed database. When the post processor 
encounters a tprof hook (or any other type of trace data 
that contains address information which needs to be 
translated into a symbol) the post processor looks up
the pid-address combination in the database to get a 
corresponding symbol. 

[0099] The MTE information includes an entry representing
the loading or unloading of each section in a
module. Thus, there is a separate entry for loading the
.text section, loading the PAGE section, and unloading
the .text section (if each of these operations did occur)
of C:\WINNT\foo.exe. In the depicted example, the loading
of these sections is shown in the lines starting with
"19 38." Examples of entries for unloading are shown in
the lines starting with "19 39" and "19 44." The unloading
entries starting with "19 39" correspond to a standard
unloading hook. The unloading entries starting with "19
44" correspond to an unloading hook for a jitted method.
[0100] The MTE hook trace data in the performance
trace data may be stored as an MTE file. Figure 12 provides
an exemplary diagram of an MTE file 1200. As
shown in Figure 12, the MTE file 1200 contains only the 
MTE entries in the performance trace data and thus, on- 
ly identifies the loading and unloading of modules. 
[0101] In a preferred embodiment of the present in- 
vention, the MTE file 1200 is correlated with the merged 
symbol file to identify the symbolic data for the loaded 
modules. However, the correlation of performance trace 
data with the merged symbol file may be performed 
based on the MTE entries in the performance trace data 
in the trace buffer or the trace file, such as the perform- 
ance trace data shown in Figure 11. 
[0102] In order to verify that the merged symbol file 
information for the module corresponds to the same 
module identified in the MTE file, a number of compar- 
isons are made. First, a comparison of the checksum 
and timestamp for the module is made. If the checksum 
and timestamp indicated in the merged symbol file cor- 
responds to the checksum and timestamp in the MTE 
file, then the module identifiers are determined to corre- 
spond and the symbolic data in the merged symbol file 
is used with the MTE file information to generate loaded 
module information. 

[0103] Some files do not contain checksum and 
timestamp information. For example, ELF object files
used in Linux do not contain checksum and timestamp 
information nor do map files. Thus, for these files, the 
checksum and timestamp check will normally have a 
negative result. However, with the map files, for exam- 
ple, other related files, such as .dbg files, can be used 
in conjunction with the map files to provide necessary 
information for checking the validity of the map files. If 
the checksum and timestamp do not match or are not 
available, the fully qualified path identified in the MTE 
file is matched with the fully qualified path in the merged 
symbol file. If there is a match, the module is verified 
and the symbolic data in the merged symbol file corre- 
sponding to the verified module entry is used to gener- 
ate loaded module information. 
[0104] If the fully qualified path does not match, a
comparison of segment sizes is made between the
merged symbol file and the MTE file. Thus, for example,
the segment length in field 7 of the MTE file is compared
to the segment length for the segment, in the merged
symbol file, of the module identified in field 11 of the MTE
file. If the segment length corresponds, then that seg- 
ment is "matched." When all the segments are matched, 
the module is verified and the symbolic data in the 
merged symbol file is used to generate loaded module 
information. 

[0105] This series of comparisons may be performed
for each module in the merged symbol file having the
appropriate module name. Thus, for example, the above
comparisons are performed for the first "foo" module
(Module Header(0)) and if there is no match, then for
the second "foo" module (Module Header(1)).
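
The series of comparisons can be sketched as a cascade that stops at the first criterion that verifies a candidate. This is an illustrative sketch only: the dictionary layout and the names `verify_module` and `select_module` are assumptions, and path comparison here is a plain string equality that ignores the case sensitivity flag of the merged symbol file.

```python
def verify_module(mte, candidate):
    """Return the criterion that verified the candidate, or None."""
    # First check: checksum and timestamp, when both are available.
    if (mte.get("checksum") is not None
            and mte.get("timestamp") is not None
            and mte["checksum"] == candidate.get("checksum")
            and mte["timestamp"] == candidate.get("timestamp")):
        return "checksum+timestamp"
    # Second check: fully qualified path.
    if mte.get("path") and mte["path"] == candidate.get("path"):
        return "path"
    # Third check: every traced segment length must match.
    traced = mte.get("segments", {})
    known = candidate.get("segments", {})
    if traced and all(known.get(n) == size for n, size in traced.items()):
        return "segment_sizes"
    return None


def select_module(mte, candidates):
    """Try each same-named module in order; return (index, criterion)."""
    for index, candidate in enumerate(candidates):
        criterion = verify_module(mte, candidate)
        if criterion:
            return index, criterion
    return None


mte_entry = {"checksum": None, "timestamp": None,
             "path": r"C:\WINNT\foo.exe", "segments": {".text": 0x5000}}
foo_modules = [
    {"checksum": 0x1111, "timestamp": 1, "path": r"C:\temp\foo.exe",
     "segments": {".text": 0x4000}},
    {"checksum": 0x2222, "timestamp": 2, "path": r"C:\WINNT\foo.exe",
     "segments": {".text": 0x5000}},
]
result = select_module(mte_entry, foo_modules)
```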
[0106] In an alternative embodiment, each comparison
may be made regardless of whether a previous
comparison resulted in a verified module. Thus, for example,
the checksum, timestamp, fully qualified path,
and segment sizes are compared for each of the "foo"
modules and the one with the best correlation is chosen
as the right module to be used for generating loaded
module information. For example, if the first "foo" module
was verified based on the segment sizes and the
second "foo" module was verified based on the fully
qualified path, since the fully qualified path has a greater
probability of identifying the correct module entry, the
second "foo" module is chosen to generate loaded module
information.

[0107] Once a module is verified, an indexed database
entry is created based on the verified module symbolic
data. This operation is performed for each MTE
entry in the performance trace file or MTE file.
[0108] The indexed database entries may be indexed
based on any searchable value. In a preferred embodiment,
the indexed database is indexed based on the
process identifier (pid) and the segment load address;
however, other searchable indices may be used without
departing from the spirit and scope of the present invention.

[0109] During post-processing, as the post-processor
encounters an MTE entry in the performance trace file
or MTE file, depending on the particular implementation,
the segment is matched to a corresponding segment in
the merged symbol file, as described above. As the MTE
entry is processed, an indexed database entry is created
with the pid and segment load address obtained from
the performance trace file and the segment name as
obtained from the merged symbol file.
[0110] Figure 13A is an exemplary extracted portion
of an example of a simplified indexed database 1300
according to the present invention. As shown in Figure
13A, entries in the indexed database 1300 include an
index 1310 (pid:address) and corresponding symbolic
data 1320, i.e. the subroutine names. Thus, when a particular
pid:address is encountered in the performance
trace file, the pid:address may be converted into a particular
symbolic location within
an executable file. The symbol itself corresponds to a
subroutine (or Java method).
[0111] A segment usually contains multiple subroutines.
Thus, for example, if a tprof record is encountered
with pid 2 and address 80298000, it would get resolved
to 18000 bytes beyond the beginning of subroutine 2 in
the version of foo.exe in the directory C:\temp\. This can
be represented as: C:\temp\foo.exe(subroutine2+0x18000).
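
The resolution of a pid:address pair into a module, symbol, and offset can be sketched as below. The segment load address (0x80200000), the segment length, and the symbol offsets are invented purely so that the example reproduces the result quoted above; treating the addresses as hexadecimal is likewise an assumption, and `SEGMENTS` and `resolve` are hypothetical names.

```python
import bisect

# Hypothetical segment table keyed by (pid, segment load address). Each value holds
# the module path, the segment length, and a sorted list of
# (offset from segment start, symbol name) pairs.
SEGMENTS = {
    (2, 0x80200000): (
        r"C:\temp\foo.exe",
        0x100000,
        [(0x00000, "subroutine1"), (0x80000, "subroutine2")],
    ),
}


def resolve(pid, address):
    """Convert pid:address into 'module(symbol+0xoffset)', or None if unknown."""
    for (seg_pid, load), (path, length, symbols) in SEGMENTS.items():
        if seg_pid != pid or not (load <= address < load + length):
            continue
        offset = address - load
        # Find the last symbol whose start offset is <= the offset in the segment.
        starts = [start for start, _ in symbols]
        i = bisect.bisect_right(starts, offset) - 1
        if i < 0:
            return None
        start, name = symbols[i]
        return f"{path}({name}+0x{offset - start:x})"
    return None


print(resolve(2, 0x80298000))
```

With the assumed table, the call resolves the address to subroutine2 plus 0x18000 in C:\temp\foo.exe.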

[0112] As mentioned above, the indexed database
1300 is obtained through a process of matching pid:address
combinations obtained from MTE file data, such
as MTE file 1200, with section data in the merged symbol
file, such as merged symbol file 1000. Figure 13B
is a flowchart outlining an exemplary operation of a post-processor
for generating the indexed database 1300
based on the MTE data and the merged symbol file. As
shown in Figure 13B, the operation starts with the post-processor
encountering an MTE hook in the MTE data
(step 1310). The MTE data identifies a pid and address.
The pid and address are used by the post-processor to
identify a module and section within the module in the
merged symbol file (step 1320).
[0113] The post-processor then computes an offset of
the address from the top of the module containing the
section (step 1330). This offset is used by the post-processor
to identify the symbolic data for the symbol (step
1340). The resulting symbolic data is stored in the indexed
database in association with the pid:address
(step 1350). The indexed database 1300 may be stored
as a separate file on a storage medium, such as hard
disk 232 or disk 276, in memory, such as local memory
209 or memory 274, or may be stored as a separate part
of the performance trace file when the performance
trace file is written to a storage medium. For example,
the indexed database 1300 may be stored at the end of
the performance trace file such that, when performance
analysis is performed by a user, the performance trace
file information may be used to identify the particular
modules and segments of the application that were
loaded.

[0114] In this way, a user of the performance trace 
tools of the present invention may perform analysis of 
the performance of an application by identifying the par- 
ticular loaded modules and segments of the application. 
In addition, the user may identify the amount of comput- 
ing time used by particular modules and segments to 
identify portions of the application that may be optimized 
for the particular platform on which the application is running.

[0115] Figure 14 is a flowchart outlining an exemplary 
operation of the data processing system according to 
the present invention when generating an indexed da- 
tabase of symbolic data based on a performance trace 
of a computer program, that is an application. While the 
flowchart shows a particular order to the steps, no order 
is meant to be implied. Rather, many of the steps may 
be performed at different times during the operation of 
the data processing system, such as the capturing of 
symbolic data and storing the symbolic data in a merged 
symbol file, which may be performed before, during, or 
after the execution of a trace. 
[0116] As shown in Figure 14, a trace of the computer 
program is executed (step 1410) and a trace file is gen- 
erated (step 1420). As described above, this trace file 
may be resident in the trace buffer or may be written to 
a storage device. 

[0117] Loaded module information is generated and 
stored (step 1430). This may be done, for example, by 
generating the MTE file that identifies only the loading 
and unloading of module segments, as described 
above. The symbolic data for the computer program is 
captured (step 1440) and stored in a merged symbol file
(step 1450). 



[0118] The symbolic data may be captured based on
a user identifying the location of files containing the symbolic
data. Alternatively, the capturing of the symbolic
data may be based on files having the same file name
as the computer program under trace or stored in a predefined
directory.

[0119] The merged symbol file is then combined with
the loaded module information to generate loaded module
symbolic data (step 1460). This combination may include
the comparisons and verification of modules described
above. The loaded module symbolic data is then
indexed and stored as an indexed database file (step
1470). The indexed database file may be stored in memory,
as a separate file written to a storage device, or as
a separate section of the performance trace file written
to a storage device, as described above.
[0120] The flowchart in Figure 14 describes the operation
of the present invention with the use of a merged
symbol file to supply symbolic data; however, the
present invention is not limited to the use of a merged
symbol file. Rather, any source of symbolic data that
may be verified may be used without departing from the
spirit and scope of the present invention.
[0121] Figure 15 is a flowchart outlining an exemplary
operation of the data processing system of the present
invention when dynamically generating an indexed database
of symbolic data, based on performance trace
data stored in the trace buffer, that is stored as a separate
section of the performance trace file. As with Figure
14, while the flowchart shows a particular order of steps,
no order is meant to be implied and many of the steps
may be performed in different orders than that shown.
[0122] The steps shown in Figure 15 are repeated for
new performance trace data written to the trace buffer.
In this way, an indexed database of symbolic data is dynamically
created as the application is under trace.
[0123] As shown in Figure 15, the operation starts
with a performance trace of the computer program being
performed (step 1510) and a trace file being generated
(step 1520). The trace file is searched for loaded module
entries (step 1530) and symbolic data for the loaded
modules is obtained (step 1540). The symbolic data is
preferably obtained from a merged symbol file as described
above; however, any source of symbolic data
that may be verified may be used without departing from
the spirit and scope of the present invention.
[0124] Once the symbolic data is obtained for the
loaded modules, the symbolic data is stored as a separate
section of the trace file containing only the symbolic
data for the loaded modules (step 1550). This symbolic
data is then indexed to generate an indexed database
of symbolic data for the loaded modules as a separate
section of the trace file (step 1560).
[0125] Thus, using either operation described above,
an indexed database of symbolic data for loaded modules
is obtained. This indexed database, in a preferred
embodiment, is obtained by gathering symbolic data
from a plurality of sources into a merged symbol file and






then comparing this merged symbol file with perform- 
ance trace data that is stored in either the trace buffer 
or in a trace file on a storage device. Matching symbolic 
data is then written to an indexed database in corre- 
spondence with the performance trace data. 
[0126] Figure 16 is a flowchart outlining an operation 
of the present invention when comparing the merged 
symbol file with the performance trace data in order to 
verify the module symbolic data. While Figure 16 shows 
a particular order to the steps, many of the steps may 
be performed in different orders. Thus, for example, the 
segment size verification may be performed before the 
fully qualified path verification, and the like. 
[0127] As shown in Figure 16, the operation starts 
with a verification of the checksum and timestamp for 
the symbolic data stored in the merged symbol file and 
the performance trace data (step 1610). It is then deter- 
mined if there is a match of the merged symbol file sym- 
bolic data and the performance trace data (step 1620). 
If there is a match, the operation continues to step 1670, 
otherwise, a determination is made as to whether the 
symbolic data is from an executable module (step
1630). This determination may be made by, for example,
determining if the extension of the module as provided 
in the merged symbol file is ".exe". 
[0128] If the symbolic data is not from an executable,
the operation continues to step 1660, otherwise, a verification
of the fully qualified path of the module is performed
(step 1640). A determination is made as to
whether the fully qualified path verification indicates that
the module symbolic data in the merged symbol file
matches the performance trace data (step 1650). If
there is a match, the operation continues to step 1670,
otherwise, the segment size is verified (step 1660).
[0129] A determination is made as to whether the
module has been verified through one of the above
checks (step 1670). If not, an error message is returned
(step 1680). If the module has been verified, the symbolic
data for the module in the merged symbol file is
matched to the performance trace data (step 1690) and
the operation ends.

[0130] As described above, the verification of symbolic
data for a module with the performance trace data
may be based on the first matching module entry in the 
merged symbol file or may involve a "best match" deter- 
mination of the symbolic data for the module. This "best 
match" determination may involve determining a match 
for each module entry in the merged symbol file for a 
particular module name and identifying which module 
entry is a best match. The best match may be deter- 
mined based on the particular attributes that are used 
to establish the match. 

[0131] Thus, the attributes may be prioritized to pro- 
vide a means for determining the best match. As an ex- 
ample, checksum and timestamp may have a highest 
priority, fully qualified path a second highest priority, and 
segment size a third highest priority. 
[0132] Figure 17 is a flowchart of an exemplary operation
of the present invention when determining a best
match of the symbolic data in the merged symbol file
with the performance trace data. As shown in Figure
17, the operation starts with verifying a first module entry
in the merged symbol file with the loaded module information
in the performance trace data (step 1710). A determination
is made as to whether there is a match of
the symbolic data with the performance trace data (step
1720). If not, the next module entry in the merged symbol
file is verified (step 1740). If there is a match, a determination
is made as to whether the match is based
on the checksum and timestamp (step 1730). If the
match is based on the checksum and timestamp, then
this is the best match and the operation ends. If the
match is not based on checksum and timestamp, the
next module entry in the merged symbol file is verified
(step 1740) and a determination is made as to whether
the next module entry is a better match than the first module
entry (step 1750).
[0133] As described above, this may be based on a
priority scheme set for the particular attributes used to
verify the module entries. For example, a flag may be
set indicating a pointer to the module in the merged symbol
file that matched and a number indicating the degree
of confidence in the match. The matching criteria may
be ranked with checksum and timestamp first, fully qualified
path second, and section lengths third. Thus, a 1,
2, or 3 would be recorded to indicate the quality of the
match. This match is then compared with a subsequent
match and the one with the higher measure of confidence
is retained. This confidence indicator may be
translated into a message that is reported to a user.
[0134] Returning to Figure 17, if the next module entry
is a better match, the next module entry is selected
as the matching module in the merged symbol file (step
1760) and the operation returns to step 1730. If the next
module is not a better match, a determination is made
as to whether there are more module entries to verify
(step 1770). If so, the operation returns to step 1740,
otherwise, the operation ends.
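
The "best match" selection of Figure 17, using the 1, 2, 3 ranking described above, can be sketched as follows. This is a simplified illustration under stated assumptions: the criterion labels, the `RANK` table, and the representation of verification results as (module index, criterion) pairs are introduced here and are not taken from the specification.

```python
# Lower number means higher confidence, following the 1, 2, 3 ranking above.
RANK = {"checksum+timestamp": 1, "path": 2, "segment sizes": 3}


def best_match(results):
    """Pick the best-verified module entry.

    results: iterable of (module_index, criterion), in merged symbol file order,
    where criterion is a key of RANK or None if the entry did not verify.
    Returns (module_index, rank) for the best entry, or None if nothing verified.
    """
    best = None
    for index, criterion in results:
        if criterion is None:
            continue  # this module entry did not match at all
        rank = RANK[criterion]
        if best is None or rank < best[1]:
            best = (index, rank)
        if rank == 1:
            break  # a checksum and timestamp match cannot be improved upon
    return best


# The first "foo" verified by segment sizes, the second by fully qualified path.
print(best_match([(0, "segment sizes"), (1, "path")]))
```

The returned rank plays the role of the confidence indicator, which could then be translated into a message for the user.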

[0135] As described above, the present invention provides
a mechanism by which an indexed database of
symbolic data for loaded modules is generated. The in- 
dexed database may be used by an analysis tool such 
that the user is presented with a symbolic representation 
of the loaded modules rather than process identifiers 
and addresses that may be more difficult to compre- 
hend. 

[0136] Figure 18 is a flowchart outlining an exemplary
operation of the present invention when using the indexed
database to provide a symbolic representation of
performance trace data for analysis by a user. As shown
in Figure 18, the operation starts with reading the trace
file (step 1810). The process identifier (pid) and address
information are obtained from the trace file (step 1820).
[0137] The indexed database is then searched for an
entry corresponding to the pid and address (step 1830).
A determination is made as to whether there is a match
found (step 1840). If so, the corresponding symbolic data
is used in accordance with the trace file (step 1850).
The particular manner in which the symbolic data is 
used will depend on the particular analysis applications 
and/or purpose of these applications. Thereafter, or if 
there is no match, it is determined whether the end of 
the trace file is encountered (step 1860). If not, the op- 
eration returns to step 1810, otherwise, the operation 
ends. 
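
The loop of Figure 18 can be sketched as follows, assuming the indexed database is held in memory as a dictionary keyed by (pid, address) and the trace file has already been parsed into (pid, address) pairs. Both representations, and the choice to fall back to the raw pid:address when no entry is found, are assumptions made for illustration.

```python
def symbolize(trace_records, indexed_db):
    """Return one display string per trace record.

    trace_records: iterable of (pid, address) pairs read from the trace file.
    indexed_db:    dict mapping (pid, address) to symbolic data.
    """
    output = []
    for pid, address in trace_records:            # steps 1810 and 1820
        symbol = indexed_db.get((pid, address))   # steps 1830 and 1840
        if symbol is not None:
            output.append(symbol)                 # step 1850
        else:
            output.append(f"{pid}:{address:x}")   # no match: keep the raw pid:address
    return output


db = {(2, 0x80298000): r"C:\temp\foo.exe(subroutine2+0x18000)"}
print(symbolize([(2, 0x80298000), (3, 0x1000)], db))
```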

[0138] Thus, with the present invention, a mechanism 
is provided by which a merged file of symbolic data is 
generated. The present invention also provides a mech- 
anism by which performance traces of applications, 
such as Java applications, and symbolic resolution can 
be performed in which the symbolic data is verified as 
being the correct symbolic data for the performance 
trace data. In addition, the present invention provides a 
mechanism by which an indexed database of symbolic 
data is generated as either a separate file or as a sep- 
arate section of a trace file. 

[0139] The invention as described above is capable 
of providing dynamic representations of the perform- 
ance trace by using MTE file information to identify the 
loading and unloading of modules. In some instances, 
it is preferable to have static representations of the per- 
formance trace at various times during the trace. 
[0140] Currently known post-processing tools make 
use of a single static representation for the symbolic ad- 
dress to name information for the trace of a computer 
program. This static representation is typically generated
in two parts. The first part is the generation of the MTE
data representing the loaded modules at the start of the
trace. The second part takes this MTE data and the symbols
for those loaded modules and creates the static representation
known by its extension as a .bbf. This MTE
data typically occurs as part of the start trace (strace) 
initialization. Alternatively, the MTE data may be collect- 
ed at the end of the trace. Getting the MTE data at the 
beginning of the trace does not handle the case where 
modules are loaded during the trace. Getting the MTE 
data at the end of the trace does not handle the case 
where modules are unloaded during the trace or after 
the trace and before the MTE data is collected. 
[0141] The .bbf is a static picture of what modules are
loaded at a particular time of the trace and the corresponding
symbols of the loaded modules. The .bbf differs
from the merged symbol file in that the merged symbol
file contains symbolic information for all of the modules
of a computer system, whereas the .bbf only contains symbolic
information for loaded modules. The .bbf represents
a collection of programs and other executable
code loaded into all processes of the computer system.
[0142] Figure 19 is an example of a portion of a
typical .bbf for a computer program. As shown, the .bbf
has a pid oriented format where the executable methods
are ordered by address within the pid, the segments are
ordered by address, and the symbols within the executable
methods are ordered by address within the segment.

[0143] As mentioned above, the .bbf, in known post-processing
tools, is generated at either the start (strace)
or the end of the trace of the computer program. Thus,
the only information that the analyst can determine from
the .bbf is the methods that were loaded at the time the
trace of the computer program was initiated or at the
time of termination of the trace. Thus, with the known
post-processing tools, there is no manner of providing
symbolic information for modules that are loaded and
unloaded dynamically after strace initialization and before
termination.

[0144] The present invention uses the merged symbol
file and trace information to generate multiple .bbf files
for determining which modules were loaded or used during
the trace. Symbolic resolution may be performed using
all of the .bbf files such that, if a module is not found
in one .bbf, it may be found in one of the other .bbf files.
[0145] In this second exemplary embodiment of the
present invention, the merged symbol file is utilized by
the post-processor, along with the MTE file information,
to generate static representations, for example, .bbf
files, of the trace of the computer program. These static
representations, in the exemplary embodiment, are created
at the beginning (strace) and end of the trace. In
this way, the beginning static representation includes
the modules loaded when the computer program is initialized.
The ending static representation identifies the
modules that were loaded during the trace. From this
information, it is possible to identify modules which were
loaded at the start of the trace and unloaded. It is also
possible to identify modules that were dynamically loaded
during the trace.

[0146] The difference between a loaded module and
a used module is that a module may be loaded and never
used, that is, never referenced by the trace records.
This occurs when a module is not executed long enough
to be interrupted by a timer profiler tick. Similarly, a module
may have been loaded at one point, used, and then
unloaded. By constructing a static representation of the
trace at the beginning and end of the trace, it can be
determined which modules were loaded upon initialization,
which of these modules were used, which of
these modules were not used, and which modules were
loaded during the trace and used or not used. For example,
if a module has an entry in the static representation
generated at the beginning of the trace, but does
not have an entry in the static representation at the end
of the trace, it can be determined that the module was
loaded, used and then unloaded. Similarly, if the static
representation at the end of the trace has an entry for a
module that does not have an entry in the static representation
generated at the beginning of the trace, the
module must have been loaded during the trace and not
used.
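
One way to picture the comparison of the two static representations is as set arithmetic over module names. The sketch below is a simplification under stated assumptions: each representation is reduced to a set of module names, "used" is decided from a separate set of modules referenced by trace records, and the function and key names are hypothetical.

```python
def compare_snapshots(start, end, referenced):
    """Classify modules from two static snapshots of a trace.

    start, end:  sets of module names in the beginning and ending representations.
    referenced:  set of module names referenced by trace records (that is, used).
    """
    everything = start | end
    return {
        "unloaded_during_trace": start - end,   # present at the start, gone at the end
        "loaded_during_trace": end - start,     # absent at the start, present at the end
        "loaded_throughout": start & end,
        "used": everything & referenced,
        "not_used": everything - referenced,
    }


report = compare_snapshots(
    start={"a.dll", "b.dll"},
    end={"b.dll", "c.dll"},
    referenced={"a.dll", "b.dll"},
)
print(report["loaded_during_trace"], report["not_used"])
```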

[0147] The MTE file contains information regarding 
loaded modules. Using the merged symbol file, in the 
manner set forth above with regard to performing symbolic
resolution to generate an indexed database, symbolic
resolution of address information for the loaded
modules can be performed. For example, the module 
information in the trace file/trace buffer is used to identify 
modules in the merged symbol file to thereby generate 
an indexed database of symbolic information. This in- 
dexed database of symbolic information may then be 
used along with the MTE file to generate a .bbf file, using 
symbolic offsets from the beginning of the modules, and 
the like, for a particular instance in the trace of the com- 
puter program. The generation of the .bbf file may be 
performed at both the beginning and end of the trace, 
for example. 

[0148] Thus, using the MTE file and the merged symbol
file, a static representation of the trace of the computer
program can be generated for various times during
the trace, for example at the beginning and the end of 
the trace. This information can then be stored and used 
to provide symbolic representations of the traced data. 
Because the static representations only represent load- 
ed modules, and because the static representations are 
generated for a finite number of points in time in the 
trace, the amount of information stored for symbolic res- 
olution can be minimized. 

[0149] Thus, with the present invention, during post- 
processing, the post-processor may make use of the 
strace .bbf to perform symbolic resolution of address information.
If a module cannot be found in the strace
.bbf, that is, the module was dynamically loaded during
the trace, the .bbf generated at the end of the trace can 
be used to perform the symbolic resolution. Thus, by 
generating multiple .bbf files during the execution of a 
trace of a computer program, symbolic resolution of dy- 
namically loaded modules may be performed. 
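
The fallback from the strace .bbf to the .bbf generated at the end of the trace can be sketched as below. Each .bbf is modeled, purely for illustration, as a dictionary from module name to its symbol table; the real .bbf layout shown in Figure 19 is not reproduced, and `resolve_module` is a hypothetical name.

```python
def resolve_module(module, bbf_files):
    """Search the .bbf files in order and return (label, symbols) for the first hit.

    bbf_files: list of (label, mapping) pairs, e.g. the strace .bbf first and the
               end-of-trace .bbf second. Returns None if no file knows the module.
    """
    for label, modules in bbf_files:
        if module in modules:
            return label, modules[module]
    return None


strace_bbf = {"foo.exe": {0x1000: "main"}}
end_bbf = {"foo.exe": {0x1000: "main"}, "plugin.dll": {0x2000: "init"}}
files = [("strace", strace_bbf), ("end", end_bbf)]

print(resolve_module("plugin.dll", files))  # found only in the end-of-trace .bbf
```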
[0150] From the example of the .bbf shown in Figure
19 above, it is evident that, when there are many meth- 
ods, there may be a lot of duplicative information stored 
in the .bbf files. For example, if a module has a plurality 
of segments, the module information for the segment 
will be repeated for each process that has the same 
module. 

[0151] The present invention, in a further embodiment,
eliminates a large majority of this duplicative information
on systems where the fully qualified path to a
module is known during tracing by identifying each load- 
ed module by its fully qualified path during the start trace 
(strace) initialization. Figure 20 is an exemplary dia- 
gram of a portion of a .bbf according to the present in- 
vention. 

[0152] As shown in Figure 20, the .bbf is path oriented.
That is, the modules are identified by the fully qualified
path. The .bbf is constructed in a path oriented
manner that only has one entry for each module. This
can be done by setting the pid for the module to be a
wildcard, for example, "????". This wildcard entry indicates
that the module entry in the .bbf is independent of
the pid. With modules that are independent of the pid,
the starting address is set to zero and all addresses for
the segments and symbolic information are relative addresses.
That is, the addresses are relative to the start
address of zero. When the fully qualified path of a module
is known, the .bbf is constructed with the "????" for
the pid. When the fully qualified path of the module is
not known, the pid is identified in the .bbf.
[0153] When symbolic resolution is performed using
the .bbf according to this further embodiment of the
present invention, the module may be "looked-up" in
the .bbf by its fully qualified path. Thereafter, if there is
no match based on fully qualified path, the module may
be "looked-up" based on the pid. Since the pid is set to
a wildcard for the modules in the .bbf of the present invention,
each module entry in the .bbf will be checked
to see if there is a match based on the segment size,
symbolic address information, and the like, in a similar
manner as set forth above with regard to verification of
modules using the merged symbol file.
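
A path-oriented .bbf with wildcard pids can be sketched as a list of entries that are looked up by fully qualified path first and by pid second. The entry layout, the function names, and the sample paths and addresses below are assumptions for illustration; the additional checking of wildcard entries by segment size and symbolic address information is not modeled.

```python
WILDCARD_PID = "????"

# Hypothetical in-memory .bbf: one entry per module. Path-identified modules use the
# wildcard pid and addresses relative to a start address of zero.
BBF = [
    {"pid": WILDCARD_PID, "path": r"C:\temp\foo.exe", "symbols": {"subroutine2": 0x80000}},
    {"pid": 7, "path": None, "symbols": {"anon_entry": 0x40001000}},
]


def find_module(entries, path=None, pid=None):
    """Look a module up by fully qualified path first, then by pid."""
    if path is not None:
        for entry in entries:
            if entry["path"] == path:
                return entry
    if pid is not None:
        for entry in entries:
            if entry["pid"] == pid:
                return entry
    return None


def absolute_address(entry, symbol, load_address):
    """Turn a relative symbol address from a wildcard entry into an absolute one."""
    return load_address + entry["symbols"][symbol]


entry = find_module(BBF, path=r"C:\temp\foo.exe", pid=2)
print(hex(absolute_address(entry, "subroutine2", 0x80200000)))
```

Because a wildcard entry stores relative addresses, a single entry can serve every process that loads the same module, which is where the reduction in duplicated information comes from.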
[0154] Thus, with the present invention, the amount
of information stored in the .bbf is minimized while still
maintaining the ability to search the .bbf for matching
module entries during symbolic resolution.
[0155] It is common for an operating system to load
segments, or sections, of a module piecemeal. Thus,
execution of a particular segment of a module may occur
prior to all of the segments for the module being loaded.
Furthermore, some segments of the module may never
be loaded during the trace or trace records of their having
been loaded may not be available. The present invention
provides a mechanism by which symbolic resolution
for the segments of a module may be performed
without requiring the entire module to be loaded or trace
records for the entire module being available.
[0156] As mentioned above, and as shown in the
sample trace file and MTE file in Figures 11 and 12, the
present invention may write redundant information to
the trace data. This redundant information includes, for
example, the module checksum, module timestamp and
module fully qualified path.

[0157] Because it is not possible to know a priori the
order in which segments will be loaded, each of the trace
records contains sufficient information for the post-processor
to construct an image of the module. This information
is used to match the segment in the trace record
to the section of the module in the merged symbol file.
[0158] In order to match a segment represented by a
trace record with a particular section within a module
represented in the merged symbol file, the following criteria
are considered. If both the segment name in the
trace record and the section name in the merged symbol
file are not null and they match, then the segment and
section are a match. If both the segment name and the
section name are null and there is only one segment in
the module, then that must be the segment identified in
the trace record. If both the segment name and the section
name are null and the addresses match, then the
segment and section are a match. If both the names are
null and the sizes in bytes match, then the segment and
the section are a match.

[0159] Once the segment and section are matched, 
the symbolic information can be written to the indexed 
database in the manner described above. Thus, the 
present invention provides a means for performing symbolic
resolution of segments within modules even when
the entire module has not been loaded or trace records 
for all of the segments of a module are not available. 
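
The segment-to-section matching criteria listed in paragraph [0158] can be sketched as a single predicate. The `Segment` tuple and the function name are hypothetical, and the treatment of the case in which only one of the two names is null (reported here as no match) is an assumption, since the text does not address it.

```python
from collections import namedtuple

# Hypothetical shape shared by a trace-record segment and a merged-symbol-file section.
Segment = namedtuple("Segment", "name address size")


def is_match(segment, section, sections_in_module):
    """Apply the matching criteria to one segment/section pair."""
    if segment.name is not None and section.name is not None:
        # Both names present: they must be equal.
        return segment.name == section.name
    if segment.name is None and section.name is None:
        # Both names null: a single-section module, equal addresses,
        # or equal sizes in bytes each count as a match.
        return (
            sections_in_module == 1
            or segment.address == section.address
            or segment.size == section.size
        )
    return False  # only one name is null: treated as no match (assumption)


print(is_match(Segment(".text", 0x1000, 512), Segment(".text", 0x9000, 256), 3))
```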
[0160] In the exemplary embodiments described 
above, the trace information is written to the trace file, 
or MTE file, when segments of modules are loaded and 
unloaded. Thus, there are redundant entries for each
segment that may be eliminated while still allowing symbolic
resolution to be performed. By removing these redundant
entries, the size of the trace file may be greatly reduced. 
[0161] In some systems, one may be able to update 
the kernel to add a status field associated with loaded 
modules. Alternatively, a kernel extension may be used 
to provide this status field. 

[0162] With the present invention, when a module is 
loaded, the updated kernel has associated with the 
module a "used (or referenced)" trace flag that is asso- 
ciated with the pid of the module. When the module is 
loaded, this flag is cleared to zero. Thus, the flag indi- 
cates that the module has been loaded but has not yet 
been used or referenced. 

[0163] As an example when running a time profiling 
application, when a time profile trace hook is encoun- 
tered during the trace, the "used" flag for the interrupted 
module on the interrupted pid is set to one by the trace 
program. When the module is unloaded, the modified 
kernel can check the "used" flag (hereafter, called UF) 
to determine if it has been set. If the UF is set, the trace 
program can output MTE trace records associated with 
the module prior to unloading the module. Similarly at 
the end of the trace all loaded modules may be checked
and each with the UF set may have MTE trace records 
written. During post processing, symbolic information is 
collected for all modules that have MTE data records; 
that is, for all modules for which trace data references 
exist. 
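
The "used" flag (UF) scheme can be sketched with a small bookkeeping class. This is only a user-space model of the behaviour described above: the class, its method names, and the representation of MTE trace records as (pid, module) tuples are assumptions, and a real implementation would live in the kernel or a kernel extension.

```python
class UsedFlagTracer:
    """Toy model of the UF scheme: emit MTE records only for modules that were used."""

    def __init__(self):
        self.used = {}          # (pid, module) -> UF
        self.mte_records = []   # MTE trace records written so far

    def load(self, pid, module):
        # On load the flag is cleared: loaded but not yet used or referenced.
        self.used[(pid, module)] = False

    def profile_hook(self, pid, module):
        # A time profile trace hook interrupted this module on this pid.
        if (pid, module) in self.used:
            self.used[(pid, module)] = True

    def unload(self, pid, module):
        # Write MTE data before unloading, but only if the module was used.
        if self.used.pop((pid, module), False):
            self.mte_records.append((pid, module))

    def end_trace(self):
        # At the end of the trace, check every module that is still loaded.
        for pid, module in list(self.used):
            self.unload(pid, module)


tracer = UsedFlagTracer()
tracer.load(1, "a.dll")
tracer.load(1, "b.dll")
tracer.profile_hook(1, "a.dll")
tracer.unload(1, "a.dll")
tracer.end_trace()
print(tracer.mte_records)  # only a.dll was used, so only it has an MTE record
```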

[0164] While post-processing the trace and attempting
to perform symbolic resolution, the trace records are 
processed sequentially, searching for the first MTE entry 
after the trace reference. From this MTE entry and the 
symbolic information in the merged symbol file, the ad- 
dress to name resolution can be determined. In an al- 
ternative embodiment, at the first reference the MTE da- 
ta for the referenced module is written prior to writing 
the trace record. With this approach the post-processor 
does not have to search for the MTE data after the trace 
reference because it has already been read by the post- 
processor. 

[0165] In a further alternative embodiment, a hash table
may be created with a key to the hash table being
the pid. The data in the hash table may include a list of
modules associated with the pid. The hash table may
include flags indicating a number of states of the module
including whether the trace data for the module has already
been written, whether the module has been referenced
before, whether the module has been loaded
and used, and the like. These flags can be used in the
same manner as the UF described above. In other
words, based on the settings of these flags, a determination
can be made as to whether or not to write out the
trace data to a trace file. In this way, the same type of
scheme as described above can be developed by a kernel
extension that does not modify the kernel data structures.

[0166] Thus, in this further embodiment of the present 
invention, the number of trace records is reduced and
thus, the trace file is minimized. By minimizing the size 
of the trace file, the amount of post-processing time is 
also reduced. In addition, by writing the module trace 
data prior to writing the trace record, the amount of 
searching performed by the post-processor is also re- 
duced, thereby making post-processing quicker. 
[0167] Figure 21 is a flowchart outlining an exemplary
operation of the present invention in accordance with
this further embodiment. As shown in Figure 21, the
operation starts with the initialization of the trace file upon
starting a trace of a computer program. During initialization,
initial loaded module data, for example, MTE data,
is written out to the trace file for those processes and
methods that are loaded at the start of the trace (step
2110). A hash table is constructed for all the currently
loaded process ids and the associated modules (step
2120). This involves creating an entry in the hash table
for each pid and attaching to the pid a list of modules
associated with the pid. Module information, such as ad- 
dress and, optionally, the length of the module, may be 
included in the hash table. 

[0168] Each module in the hash table further includes
a trace data flag that indicates whether the trace data 
for that module has been written to the trace file or trace 
buffer. Upon initialization, since all of the entries in the 
hash table correspond to processes and methods that 
have been written to the trace file or trace buffer in step 
2110, the trace data flags for these entries are set to 
true (step 2130). 
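
Steps 2120 and 2130 can be sketched as follows. This is a minimal illustration that assumes a Python dictionary as the hash table; the function name `build_module_table` and the field names are hypothetical and do not come from the specification.

```python
def build_module_table(loaded):
    """Build the pid-keyed table for modules loaded at trace start.

    `loaded` is an iterable of (pid, address, length, name) tuples
    (an assumed input shape). Because the MTE data for these modules
    was already written in step 2110, every trace data flag starts
    out as True (step 2130).
    """
    table = {}
    for pid, address, length, name in loaded:
        # One entry per pid, with the pid's modules attached to it.
        table.setdefault(pid, {})[address] = {
            "length": length,
            "name": name,
            "trace_written": True,   # the trace data flag
        }
    return table
```

Keying the inner mapping by module address is a design choice of this sketch; it makes the later lookup "by pid and module address" a pair of dictionary accesses.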

[0169] The trace is then executed (step 2140) and a 
determination is made as to whether an MTE trace hook
is encountered during the trace (step 2150). If not, a de- 
termination is made as to whether a profile hook is en- 
countered (step 2160). If a profile hook is not encoun- 
tered, the trace is continued by returning to step 2140. 
If a profile hook is encountered, the module in which the 
profile hook is encountered is looked up by pid and module
address in the hash table (step 2170). A determination
is then made as to whether the trace data flag for
the module has been set to false, i.e., the trace data has 
not been written to the trace file or trace buffer (step 
2180). If the trace data flag is false, the trace data is 
written out to the trace file and the trace data flag is set 
to true (step 2190). Thereafter, or if the trace data flag 
is true in step 2180, the profile hook trace data is written
to the trace file (step 2200). The trace may then continue
if desired (step 2300).
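
The profile hook path (steps 2170 to 2200) can be sketched as below. The table layout (pid, then module address, then a small record with a `trace_written` flag), the tuple shapes of the emitted records, and the function name are assumptions for illustration. The specification does not say what happens when no module contains the address; this sketch simply writes the profile record in that case.

```python
def on_profile_hook(table, trace_out, pid, address):
    """Handle one profile hook, writing module data lazily.

    table:     pid -> {module base address -> {"length", "name", "trace_written"}}
    trace_out: list standing in for the trace file or trace buffer
    """
    for base, mod in table.get(pid, {}).items():
        if base <= address < base + mod["length"]:          # step 2170
            if not mod["trace_written"]:                      # step 2180
                # Step 2190: write the module data before the first reference.
                trace_out.append(("MTE", pid, base, mod["length"], mod["name"]))
                mod["trace_written"] = True
            break
    trace_out.append(("PROFILE", pid, address))               # step 2200
```

Calling the function twice for the same module produces a single module record followed by two profile records, which is the reduction in trace records the embodiment aims for.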

[0170] If in step 2150 an MTE hook is encountered, the
hash table is searched for the pid associated with the
MTE hook (step 2210). A determination is made as to
whether an entry for the pid is present in the hash table 
(step 2220). If not, an entry is added to the hash table 
using the pid (step 2230). 

[0171] Thereafter, or if an entry based on the pid is
found in the hash table, the hash table is searched for
the module address associated with the MTE hook (step 
2240). A determination is then made as to whether a 
module entry based on the module address was found 
(step 2250). If not, a module entry is added to the hash 
table using the module address (step 2260). The addition
of the module entry is made in association with the
pid in the MTE hook. 

[0172] If a module entry is found in step 2250, a determination
is made as to whether a partial or complete
overlay of the module entry is necessary (step 2270). If
so, the module entry is overlaid with the module information
associated with the MTE hook (step 2280). For
a partial overlay, this may include adjusting the length
of existing module entries associated with the pid and
then inserting the module information associated with
the MTE hook. For a complete module overlay, this may
include deleting an existing module entry and replacing
it with a new module entry based on the module information
associated with the MTE hook.
[0173] A partial or complete overlay may occur when,
for example, a process is stopped and a new process is
created using the same pid as the previous process. In
such a case, the module entry may be overlaid with a
new module entry. In an alternative embodiment, the
trace file may contain a separate trace entry indicating
the stopping of a process and the creation of a new process
using the same pid. Thereafter, any further references
to the pid will be resolved using the new module
entry.
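
One way to picture partial and complete overlays is as operations on address ranges. The sketch below is an assumption-laden illustration, not the specified method: modules are modeled as `(start, length, name)` tuples, an existing entry that begins before the new module is shortened (partial overlay), and an existing entry that begins inside the new module's range is deleted (treated here as a complete overlay).

```python
def overlay(modules, new):
    """Return the module list for one pid after overlaying `new`.

    modules: list of (start, length, name) tuples (hypothetical layout)
    new:     (start, length, name) taken from the MTE hook
    """
    n_start, n_len, _ = new
    n_end = n_start + n_len
    result = []
    for start, length, name in modules:
        end = start + length
        if end <= n_start or start >= n_end:
            result.append((start, length, name))            # no overlap: keep as is
        elif start < n_start:
            # Partial overlay: adjust the length of the existing entry.
            result.append((start, n_start - start, name))
        # Otherwise the entry starts inside the new range: delete it.
    result.append(new)                                        # insert the new module
    return sorted(result)
```

For example, overlaying a module at `0x1800` of length `0x2000` onto entries at `0x1000` (length `0x1000`) and `0x3000` (length `0x800`) shortens the first entry to `0x800` and deletes the second.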

[0174] Thereafter, the trace data flag for the module
is set to false (step 2290). A determination is then made
as to whether to continue the trace (step 2300). If so,
the operation returns to step 2140. Otherwise, the operation
terminates. During post-processing, the MTE
data for the file(s) is/are read in and used in time sequence
order.
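
The MTE hook path (steps 2210 to 2290) can be sketched in simplified form. The dictionary layout and the function name `on_mte_hook` are assumptions, and the overlay decision of steps 2270 and 2280 is reduced here to replacing any entry at the same module address.

```python
def on_mte_hook(table, pid, address, length, name):
    """Record a module load reported by an MTE hook.

    table: pid -> {module address -> {"length", "name", "trace_written"}}
    The module data is NOT written to the trace here; the flag is set
    to False so it is written only if the module is later referenced.
    """
    modules = table.setdefault(pid, {})      # steps 2210-2230: find or add the pid entry
    modules[address] = {                     # steps 2240-2280: add or replace the module entry
        "length": length,
        "name": name,
        "trace_written": False,              # step 2290
    }
    return modules[address]
```

Deferring the write in this way means a module that is loaded but never referenced by a profile hook contributes no module record to the trace file.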

[0175] As described above, the functionality of the 
hash table for storing status flags and the like, may be 
performed by updating the kernel to add a status flag 
associated with loaded modules or by providing a kernel
extension. Similarly, a process local storage may be utilized
for maintaining this status flag. Alternatively, a
process control block of the operating system may be 
modified directly to maintain this status flag. 
[0176] It is important to note that while the present invention
has been described in the context of a fully functioning
data processing system, those of ordinary skill
in the art will appreciate that the processes of the
present invention are capable of being distributed in the
form of a computer readable medium of instructions and
a variety of forms and that the present invention applies
equally regardless of the particular type of signal bearing
media actually used to carry out the distribution. Examples
of computer readable media include recordable-type
media such as a floppy disc, a hard disk drive, a RAM,
and CD-ROMs and transmission-type media such as
digital and analog communications links.
[0177] The description of the present invention has 
been presented for purposes of illustration and descrip- 
tion, but is not intended to be exhaustive or limited to 
the invention in the form disclosed. Many modifications 
and variations will be apparent to those of ordinary skill 
in the art. The embodiment was chosen and described 
in order to best explain the principles of the invention, 
the practical application, and to enable others of ordi- 
nary skill in the art to understand the invention for vari- 
ous embodiments with various modifications as are suit- 
ed to the particular use contemplated. 
[0178] In summary, the invention provides an apparatus
and method for cataloging symbolic data for use in
performance analysis of computer programs.
The apparatus and method stores symbolic data for
loaded modules during or shortly after a performance 
trace and utilizes the stored symbolic data when per- 
forming a performance analysis at a later time. A 
merged symbol file is generated for a computer pro- 
gram, or application, under trace. The merged symbol 
file contains information useful in performing symbolic 
resolution of address information in trace files for each 
instance of a module. During post processing of the 
trace information generated by a performance trace of 
a computer program, symbolic information stored in the 
merged symbol file is compared to the trace information 
stored in the trace file. The correct symbolic information 
in the merged symbol file for loaded modules is identified
based on a number of validating criteria. The correct
symbolic information for the loaded modules may then 
be stored as an indexed database that is used to resolve 
address information into corresponding symbolic infor- 
mation when providing the trace information to a display 
for use by a user. 
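
The validating criteria can be illustrated with a minimal sketch that ranks candidate module entries by the strongest criterion that matches: checksum and timestamp first, then fully qualified path, then segment size. The dictionary field names and both function names are hypothetical; exact string comparison of paths is a simplification that ignores any case-sensitivity setting.

```python
def match_level(trace, entry):
    """Return 3, 2, 1 or 0 for the strongest criterion on which the
    trace data matches a merged symbol file entry (0 means no match)."""
    if trace.get("checksum") is not None and trace.get("timestamp") is not None:
        if (trace["checksum"], trace["timestamp"]) == (entry["checksum"], entry["timestamp"]):
            return 3                      # highest priority: checksum and timestamp
    if trace.get("path") is not None and trace["path"] == entry["path"]:
        return 2                          # next: fully qualified path
    if trace.get("segment_size") is not None and trace["segment_size"] == entry["segment_size"]:
        return 1                          # lowest priority: segment size
    return 0

def best_match(trace, entries):
    """Pick the entry verified by the highest-priority criterion, or None."""
    best = max(entries, key=lambda e: match_level(trace, e), default=None)
    if best is None or match_level(trace, best) == 0:
        return None
    return best
```

With two entries for a module named "foo", trace data carrying the checksum and timestamp of the second entry selects that entry even if its path matches the first.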



Claims 

1. A method of verifying symbolic data for loaded modules,
comprising:

reading trace data for a module; 

comparing the trace data with module symbolic 
data in a merged symbol file; and 

verifying that the trace data matches the module
symbolic data in the merged symbol file
based on one or more predetermined criteria.

2. The method of claim 1, wherein the one or more
predetermined criteria include one or more of a
checksum, a timestamp, a fully qualified path, and
a segment size.

3. The method of claim 1 or claim 2, wherein the trace 
data is read from a trace buffer. 

4. The method of any of claims 1 to 3, wherein the 
trace data is read from a trace file written to a stor- 
age device. 

5. The method of any of claims 1 to 4, wherein the 
reading, comparing and verifying steps are per- 
formed dynamically as trace data is written to a 
trace buffer. 

6. The method of any of claims 1 to 4, wherein the 
reading, comparing and verifying steps are per- 
formed at a remote time from when the trace data 
is written to a trace file. 

7. The method of any preceding claim, wherein com- 
paring the trace data with module symbolic data in 
a merged symbol file includes comparing a check- 
sum and timestamp in the trace data with a check- 
sum and timestamp in the module symbolic data in 
the merged symbol file. 

8. The method of claim 7, wherein comparing the trace 
data with module symbolic data in a merged symbol 
file further includes comparing a fully qualified path 
in the trace data with a fully qualified path in the 
module symbolic data, if the checksum and timestamp
in the trace data do not match the checksum
and timestamp in the module symbolic data or the
checksum and timestamp in the trace data are not
available.

9. The method of claim 8, wherein comparing the trace 
data with module symbolic data in a merged symbol 
file further includes comparing a segment length in 
the trace data with a segment length in the module 
symbolic data, if the fully qualified path in the trace 
data does not match the fully qualified path in the 
module symbolic data. 

10. The method of any preceding claim, wherein the 
one or more criteria have an associated priority. 

11. The method of claim 10, wherein the one or more 
criteria include checksum and timestamp, a fully 
qualified path, and a segment size, and wherein the 
checksum and timestamp has a highest priority and 
the segment size has a lowest priority. 

12. The method of any preceding claim, wherein the
merged symbol file includes a plurality of module
entries and wherein comparing the trace data with
module symbolic data in a merged symbol file includes
identifying a module entry that is a best
match with the trace data.

13. The method of claim 12, wherein identifying a module
entry that is a best match with the trace data
includes comparing the trace data to each of the
plurality of module entries and identifying one of the
plurality of module entries as a best match, based
on which of the one or more criteria is used to verify
the module entry.

14. The method of any preceding claim, wherein the
trace data includes redundant information identifying
a module for each segment of the module.

15. The method of claim 14, wherein the redundant information
includes at least one of module checksum,
module timestamp and module fully qualified
path.

16. A method of displaying data for analysing a performance
trace of a computer application, comprising:

reading module trace data from a trace file;

reading module symbolic data from a symbolic
data file;

verifying that the module symbolic data corresponds
to the module trace data;

correlating the module symbolic data with the
module trace data to generate correlated data;
and

displaying the correlated data.

17. An apparatus for verifying symbolic data for loaded
modules, comprising:

a trace data storage device;

a merged symbol file storage device; and

a processor coupled to the trace data storage
device and the merged symbol file storage
device, wherein the processor reads trace data
for a module from the trace data storage device,
compares the trace data with module symbolic
data in a merged symbol file read from the
merged symbol file storage device, and verifies
that the trace data matches the module symbolic
data in the merged symbol file based on
one or more predetermined criteria.

18. An apparatus for displaying data for analysing a
performance trace of a computer application, comprising:

a trace data storage device;

a symbolic data storage device; and

a processor coupled to the trace data storage
device and the symbolic data storage device,
wherein the processor reads module trace data
from the trace data storage device, reads module
symbolic data from the symbolic data storage
device, verifies that the module symbolic
data corresponds to the module trace data, correlates
the module symbolic data with the module
trace data to generate correlated data, and
displays the correlated data on a display device.

19. A computer program product in a computer readable
medium for verifying symbolic data for loaded
modules, comprising:

first instructions for reading trace data for a
module;

second instructions for comparing the trace data
with module symbolic data in a merged symbol
file; and

third instructions for verifying that the trace data
matches the module symbolic data in the
merged symbol file based on one or more predetermined
criteria.

20. A computer program product in a computer readable
medium for displaying data for analysing a performance
trace of a computer application, comprising:

first instructions for reading module trace data
from a trace file;

second instructions for reading module symbolic
data from a symbolic data file;

third instructions for verifying that the module
symbolic data corresponds to the module trace
data;

fourth instructions for correlating the module
symbolic data with the module trace data to
generate correlated data; and

fifth instructions for displaying the correlated
data.



EP1 172 729 A2 




[FIG. 2A: block diagram. Recoverable labels: processors, system bus, memory controller/cache, local memory, I/O bridge, I/O bus, PCI bus bridges, PCI buses, modem, network adapter, graphics adapter, hard disk; reference numerals 200 to 232. A stray "CLIENT" label also appears at the top of the sheet.]




[FIG. 4: block diagram. Recoverable labels: trace program, process, buffer, trace file, post processor; reference numerals 400 to 406.]



[FIG. 5: initialization phase (500), profiling phase (502), post-processing phase (504).]

[Adjacent figure, number illegible in the source. Recoverable labels: computer program execution starts, performance trace starts, performance trace ends, computer program execution ends, generate merge file, determine loaded modules using merge file, build indexed database for loaded modules.]






[FIG. 7: flowchart. Recoverable steps, in order: begin; allocate buffer; turn on trace hooks; start tracing; receive trace data; store trace data in buffer; send buffer contents to file; generate report; end. Reference numerals 700 to 714.]



[FIG. 8: flowchart. Recoverable steps, in order: begin; obtain program counter; obtain interrupted thread ID; decision "interpreter code?"; identify method block being interpreted; send trace information. Reference numerals 800 to 806.]







[FIG. 9: debug files (930), non-stripped version of module (940), other symbolic information files (950).]



MERGESYM HEADER (1001)
 - header
 - case sensitivity flag, slash character

MERGED ELEMENT (1002)
 - module name

MODULE HEADER (1003)
 - module path, checksum, timestamp

SECTION HEADER (1004)
 - section name, offset, length

SYMBOLIC DATA (1005)
 - symbolic name, offset, length

FIG. 10A






Representation of a Sample Merge File (reference numerals 1000 to 1070):

Mergesym Header(0)
  Merged element(0): module name: "foo"
    Module Header(0): path: "C:\WINNT\"
                      extension: "exe"
                      checksum: 0x000E42C3
                      timestamp: 0x36224C0A
      Section Header(0): name: ".text"
                         offset: 0x0
                         length: 0x00080000
        Symbolic Data(0): symbol name: "subroutine1"
                          offset: 0x0
                          length: 0x00040000
        Symbolic Data(1): symbol name: "subroutine2"
                          offset: 0x00040000
                          length: 0x00040000
      Section Header(1): name: "PAGE"
                         offset: 0x00080000
                         length: 0x00008000
    Module Header(1): path: "C:\temp\"
                      extension: "exe"
                      checksum: 0x0001BD3D
                      timestamp: 0x3600B326
      Section Header(0): name: ".text"
                         offset: 0x0
                         length: 0x000C0000
        Symbolic Data(0): symbol name: "subroutine1"
                          offset: 0x0
                          length: 0x00040000
        Symbolic Data(1): symbol name: "subroutine2"
                          offset: 0x00040000
                          length: 0x00040000
        Symbolic Data(2): symbol name: "subroutine3"
                          offset: 0x00080000
                          length: 0x00040000
      Section Header(1): name: "PAGE"
                         offset: 0x00080000
                         length: 0x00008000

[Fragments of a following figure: C:\WINNT\ ; C:\temp\foo.exe(subroutine2) ; C:\temp\foo.exe(subroutine3)]